Preface


This book highlights the latest advances in engineering mathematics with a main 
focus on the mathematical models, structures, concepts, problems and computa- 
tional methods and algorithms most relevant for applications in modern technolo- 
gies and engineering. It addresses mathematical methods of noncommutative 
algebra, applied matrix analysis, operator analysis, probability theory and stochastic 
processes, geometry, computational mathematics, optimization and operations 
research with applications in network analysis, ranking in networks, networks in 
bioinformatics, genetic analysis and cancer research, data mining and classification, 
production logistics optimization. 

The individual chapters cover both theory and applications, and include a wealth 
of figures, schemes, algorithms, tables and results of data analysis and simulation. 
Presenting new methods and results, reviews of cutting-edge research, and open 
problems for future research, they equip readers to develop new mathematical 
methods and concepts of their own, and to further compare and analyze the methods 
and results discussed. 

Chapter “Classification of Low Dimensional 3-Lie Superalgebras” by Viktor
Abramov and Priit Lätt is concerned with the extension of the notion of n-Lie algebra to
$\mathbb{Z}_2$-graded structures by means of a graded Filippov identity, giving the notion of
n-Lie superalgebra. A classification of low dimensional 3-Lie superalgebras is proposed,
and it is shown that given an n-Lie superalgebra equipped with a supertrace
one can construct an (n + 1)-Lie superalgebra, which is referred to as the induced
(n + 1)-Lie superalgebra. Based on a Clifford algebra which, when endowed with a
$\mathbb{Z}_2$-graded structure and a graded commutator, can be viewed as a Lie superalgebra,
and on a supertrace defined via its matrix representation, 3-Lie superalgebras
are constructed and explicitly described by their ternary commutators. In
Chap. “Semi-Commutative Galois Extension and Reduced Quantum Plane” by 
Viktor Abramov and Md. Raknuzzaman, it is shown that a semi-commutative 
Galois extension of an associative unital algebra by means of an element $\tau$, which
satisfies $\tau^N = 1$ (1 is the identity element of the algebra and $N \ge 2$ is an integer),
induces a structure of graded q-differential algebra, where q is a primitive Nth
root of unity. The graded q-differential algebra is constructed and its first order
noncommutative differential calculus is studied. Moreover, the higher order
noncommutative differential calculus induced by a semi-commutative Galois extension
of associative unital algebra is studied, and it is shown that a reduced quantum 
plane can be viewed as a semi-commutative Galois extension of a fractional 
one-dimensional space. Chapter “Valued Custom Skew Fields with Generalised 
PBW Property from Power Series Construction” describes an interesting con- 
struction of associative algebras with a number of useful properties. The con- 
struction is basically that of a power series algebra with given commutation 
relation. The constructed algebras have a Poincaré-Birkhoff-Witt type basis, are
equipped with a norm (actually an ultranorm) that is trivial to compute for basis 
elements, are topologically complete, and satisfy their given commutation relation. 
In addition, parameters can be chosen so that the algebras will in fact turn out to be 
skew fields and the norms become valuations. Chapter “Computing Burchnall-Chaundy
Polynomials with Determinants” by Johan Richter and Sergei Silvestrov
is concerned with the generalization of a method for computing the Burchnall-Chaundy
polynomial of two commuting differential operators, based on the Burchnall-Chaundy
eliminant determinant construction, to the class of rings known as Ore extensions. It
is shown that the eliminant construction partially generalizes, and counterexamples
are provided showing that these generalizations do not always retain all desired
properties. In Chap. “Centralizers and Pseudo-Degree Functions” by Johan
Richter, a generalization of a proof of certain results by Hellström and Silvestrov on
centralizers in graded algebras is presented, centralizers in certain algebras with
valuations are considered, and a proof is given that the centralizer of an element in these
algebras is a free module over a certain ring. Under further assumptions it is
shown that the centralizer is also commutative. In Chap. “Crossed Product
Algebras for Piece-Wise Constant Functions” by Johan Richter, Sergei Silvestrov, 
Vincent Ssembatya and Alex Behakanira Tumwesigye, algebras of functions that 
are constant on the sets of a partition are considered together with their crossed 
product algebras with the group of integers and the commutant of the function 
algebra in the crossed product algebra. In Chap. “Commutants in Crossed Product 
Algebras for Piece-Wise Constant Functions” by Johan Richter, Sergei Silvestrov 
and Alex Behakanira Tumwesigye, crossed product algebras of algebras of 
piece-wise constant functions on the real line with the group of integers are con- 
sidered, and for an increasing sequence of algebras the set difference between the 
corresponding commutants is described. 

Chapter “Asymptotic Expansions for Moment Functionals of Perturbed Discrete 
Time Semi-Markov Processes” by Mikael Petersson is devoted to the study of 
moment functionals of mixed power-exponential type for nonlinearly perturbed 
semi-Markov processes in discrete time. Conditions under which the moment 
functionals of interest can be expanded in asymptotic power series with respect to 
the perturbation parameter are given and it is shown how the coefficients in these 
expansions can be computed from explicit recursive formulas. The results of 
this chapter have applications for studies of quasi-stationary distributions. 
In Chap. “Asymptotics for Quasi-Stationary Distributions of Perturbed Discrete 
Time Semi-Markov Processes” by Mikael Petersson, quasi-stationary distributions
of nonlinearly perturbed semi-Markov processes in discrete time are studied.
Distributions of this type are of interest for the analysis of stochastic systems which have
finite lifetimes but are expected to persist for a long time. Asymptotic power series
expansions for quasi-stationary distributions are obtained, it is shown how the 
coefficients in these expansions can be computed from a recursive algorithm, and a 
numerical example for a discrete time Markov chain is presented as an illustration 
of this algorithm. Chapter “Asymptotic Expansions for Stationary Distributions of 
Perturbed Semi-Markov Processes” by Dmitrii Silvestrov and Sergei Silvestrov 
presents new algorithms for computing asymptotic expansions for stationary dis- 
tributions of nonlinearly perturbed semi-Markov processes based on special tech- 
niques of sequential phase space reduction, which can be applied to processes with 
asymptotically coupled and uncoupled finite phase spaces. Chapter “PageRank, a 
Look at Small Changes in a Line of Nodes and the Complete Graph” is about the 
PageRank algorithm used as part of the ranking process of different Internet pages 
in search engines, ranking in citation networks as well as other information, 
communication and big data networks. The chapter focuses on the behavior of 
PageRank as the system dynamically changes either by contracting or expanding 
such as when subtracting or adding nodes or links or groups of nodes or links. 
PageRank is considered as the solution of a linear system of equations and 
examined in both the ordinary normalized version of PageRank as well as the 
non-normalized version, and explicit formulas for the PageRank of some simple 
link structures are obtained. Chapter “PageRank, Connecting a Line of Nodes with
a Complete Graph” is focused on the PageRank algorithm, following the original
definition of PageRank by Sergey Brin and Larry Page as the stationary distribution of
a certain random walk on a graph used to rank homepages on the Internet.
Specifically, this chapter is concerned with how PageRank changes after adding or
removing an edge between otherwise disjoint subgraphs, for example link structures
consisting of a line of nodes or a complete graph, and different ways to combine the
two. Both the ordinary normalized version of PageRank and a
non-normalized version of PageRank can be found by solving the corresponding linear
system, and it is demonstrated that explicit formulas
for the PageRank can be found in some simple link structures and used to take a
more in-depth look at the behavior of the ranking as the system changes. Chapter
“Graph Centrality Based Prediction of Cancer Genes” by Holger Weishaupt, Patrik 
Johansson, Christopher Engström, Sven Nelander, Sergei Silvestrov and Fredrik
J. Swartling focuses on how graph centralities obtained from biological networks 
have been used to predict cancer genes. As current cancer therapies including 
surgery, radiotherapy and chemotherapy are often plagued by high failure rates, 
designing more targeted and personalized treatment strategies requires a detailed 
understanding of druggable tumor driver genes. Specifically, the chapter begins
by describing the current problems in cancer therapy and the reasoning behind
using network based cancer gene prediction, followed by an outline of biological
networks, their generation and properties, and finally by a review of major concepts,
recent results as well as future challenges regarding the use of graph centralities in
cancer gene prediction. Chapter “Output Rate Variation Problem: Some Heuristic 
Paradigms and Dynamic Programming” by Gyan Bahadur Thapa and Sergei 
Silvestrov is concerned with the output rate variation problem, which is one of the 
important research directions in the area of multi-level just-in-time production 
systems. A short survey of the mathematical models of this problem is provided 
together with consideration of its NP-hardness, a brief review of heuristic 
approaches to the problem, the discussion on the dynamic programming approach 
and pegging assumption reducing the multi-level problem to weighted single-level 
problem as well as some open problems. 

In Chap. “$L^p$-Boundedness of Two Singular Integral Operators of Convolution
Type” by Sten Kaijser and John Musonda, boundedness properties are investigated for
two singular integral operators defined on $L^p$-spaces ($1 < p < \infty$) on the real line,
both as convolution operators on $L^p(\mathbb{R})$ and on the weighted spaces $L^p(\omega)$, where
$\omega(x) = 1/(2\cosh\frac{\pi}{2}x)$. In Chap. “Fractional-Wavelet Analysis of Positive
definite Distributions and Wavelets on $\mathcal{D}'(\mathbb{C})$” by Emanuel Guariglia and Sergei
Silvestrov, a wavelet expansion theory for positive definite distributions over the 
real line is considered and a fractional derivative operator for complex functions in 
the distribution sense is defined. The Ortigueira-Caputo fractional derivative
operator is rewritten as a convolution according to the fractional calculus of real 
distributions, and the fractional derivatives of the complex Shannon wavelet and 
Gabor-Morlet wavelet are computed together with their plots and main properties.
Chapters “Linear Classification of Data with Support Vector Machines and 
Generalized Support Vector Machines” and “Linear and Nonlinear Classifiers of 
Data with Support Vector Machines and Generalized Support Vector Machines” by 
Talat Nazir, Xiaomin Qi and Sergei Silvestrov are devoted to support vector
machines for linear and nonlinear classification of data. Generalized support vector
machine for classification of data is introduced, and it is shown that the problem of 
generalized support vector machine is equivalent to the problem of generalized 
variational inequality. Various results for the existence of solutions are established 
and several examples are constructed. In Chaps. “Common Fixed Points of Weakly 
Commuting Multivalued Mappings on a Domain of Sets Endowed with Directed 
Graph” and “Common Fixed Point Results for Family of Generalized Multivalued 
F-contraction Mappings in Ordered Metric Spaces” by Talat Nazir and Sergei
Silvestrov, the existence of coincidence points and common fixed points for
multivalued mappings satisfying certain graphic $\psi$-contraction conditions
with set-valued domain endowed with a graph is established without appealing to
continuity. The existence of common fixed points of a family of multivalued
mappings satisfying generalized F-contractive conditions in ordered metric spaces is
also investigated.

The book consists of carefully selected and refereed contributed chapters cov- 
ering research developed as a result of a focused international seminar series on 
mathematics and applied mathematics and a series of three focused international 
research workshops on engineering mathematics organized by the Research 
Environment in Mathematics and Applied Mathematics at Mälardalen University
from autumn 2014 to autumn 2015: the International Workshop on Engineering
Mathematics for Electromagnetics and Health Technology; the International
Workshop on Engineering Mathematics, Algebra, Analysis and Electromagnetics;
and the 1st Swedish-Estonian International Workshop on Engineering Mathematics,
Algebra, Analysis and Applications. 

This book project has been realized thanks to the strategic support by
Mälardalen University to the research and research education in Mathematics,
which is conducted by the research environment Mathematics and Applied
Mathematics (MAM), in the established research area of Educational Sciences and
Mathematics at the School of Education, Culture and Communication at Mälardalen
University. We are grateful also to the EU Erasmus Mundus projects FUSION,
EUROWEB and IDEAS, the Swedish International Development Cooperation 
Agency (Sida) and International Science Programme in Mathematical Sciences, 
Swedish Mathematical Society, Linda Peetre Memorial Foundation, as well as other 
national and international funding organizations and the research and education 
environments and institutions of the individual researchers and research teams that 
contributed to this book. 

We hope that this book will serve as a source of inspiration for a broad spectrum 
of researchers and research students in mathematics and applied mathematics, as 
well as in the areas of applications of mathematics considered in the book. 


Västerås, Sweden Sergei Silvestrov
July 2016 Milica Rančić
Classification of Low Dimensional 3-Lie
Superalgebras 


Viktor Abramov and Priit Lätt


Abstract The notion of n-Lie algebra introduced by V.T. Filippov can be viewed
as a generalization of the concept of binary Lie algebra to algebras with an n-ary
multiplication law. The notion of Lie algebra can be extended to $\mathbb{Z}_2$-graded structures,
giving the notion of Lie superalgebra. Analogously, the notion of n-Lie algebra can be
extended to $\mathbb{Z}_2$-graded structures by means of a graded Filippov identity, giving the
notion of n-Lie superalgebra. We propose a classification of low dimensional 3-Lie
superalgebras. We show that given an n-Lie superalgebra equipped with a supertrace
one can construct an (n + 1)-Lie superalgebra, which is referred to as the induced
(n + 1)-Lie superalgebra. A Clifford algebra endowed with a $\mathbb{Z}_2$-graded structure
and a graded commutator can be viewed as a Lie superalgebra. It is well known
that this Lie superalgebra has a matrix representation, which allows one to introduce a
supertrace. We apply the method of induced Lie superalgebras to a Clifford algebra
to construct 3-Lie superalgebras and give their explicit description by ternary
commutators.


Keywords n-Lie algebras · n-Lie superalgebras · Clifford algebras · Induced n-Lie
superalgebras


1 Introduction 


Recently there has been a marked increase of interest in theoretical physics in
algebras with an n-ary multiplication law. Since Lie algebras play a
crucial role in theoretical physics, the development of an n-ary analog of the
concept of Lie algebra seems especially important. In [5] V.T. Filippov proposed the notion
of n-Lie algebra, which can be considered as a possible generalization of the concept
of Lie algebra to structures with an n-ary multiplication law. In the approach proposed by
V.T. Filippov the n-ary commutator of an n-Lie algebra is skew-symmetric and satisfies
an n-ary analog of the Jacobi identity, which is now called the Filippov identity. It is worth
mentioning that there is an approach different from the one proposed by V.T. Filippov,
where a ternary commutator is not skew-symmetric but obeys a symmetry based on
a representation of the group of cyclic permutations $\mathbb{Z}_3$ by cubic roots of unity [2]. It
is well known that the concept of Lie algebra can be extended to $\mathbb{Z}_2$-graded structures
with the help of a graded commutator and a graded Jacobi identity, and the corresponding
structure is known as a Lie superalgebra.

In the present paper we show that the notion of n-Lie algebra proposed by
V.T. Filippov can be extended to $\mathbb{Z}_2$-graded structures by means of a graded
n-commutator and a graded analog of the Filippov identity. This $\mathbb{Z}_2$-graded n-Lie
algebra will be referred to as an n-Lie superalgebra. We show that the method of induced
n-Lie algebras proposed in [3], which is based on an analog of a trace, can be applied to
n-Lie superalgebras if a supertrace is used instead of a trace. We introduce
notions such as an ideal of an n-Lie superalgebra, a subalgebra of an n-Lie superalgebra,
descending series and prove several results analogous to the results proved in [3] for 
n-Lie algebras. We propose a classification of low dimensional 3-Lie superalgebras 
and find their commutation relations. A Clifford algebra can be used to construct a 
Lie superalgebra if one equips it with a graded commutator. This Lie superalgebra 
has a matrix representation called supermodule of spinors and this representation 
can be endowed with a supertrace. Thus we have all basic components of a method 
of induced n-Lie superalgebras and applying this method we construct a series of 
3-Lie superalgebras. 


2 Supertrace and Induced n-Lie Superalgebras 


The notion of Lie algebra can be extended from binary algebras to algebras with an n-ary
multiplication law with the help of the notion of n-Lie algebra, where n is any integer
greater than or equal to 2. This approach was proposed by V.T. Filippov in [5], and it is
based on an n-ary analog of the Jacobi identity, which is now called the Filippov identity.
It is well known that the concept of binary Lie algebra can be extended to $\mathbb{Z}_2$-graded
structures, giving the notion of Lie superalgebra. Similarly, the notion of n-Lie algebra
can be extended to $\mathbb{Z}_2$-graded structures, giving a structure which we call an n-Lie
superalgebra [1, 4]. In this section we give the definitions of n-Lie algebra and n-Lie
superalgebra, and show that the structure of induced n-Lie algebra based on an analog
of trace [3] can be extended to n-Lie superalgebras with the help of a supertrace.


Definition 1 A vector space $\mathfrak{g}$ endowed with a mapping $[\cdot,\dots,\cdot] : \mathfrak{g}^n \to \mathfrak{g}$ is said to
be an n-Lie algebra if $[\cdot,\dots,\cdot]$ is n-linear, skew-symmetric and satisfies the identity

$$[x_1,\dots,x_{n-1},[y_1,\dots,y_n]] = \sum_{i=1}^{n} [y_1,\dots,y_{i-1},[x_1,\dots,x_{n-1},y_i],y_{i+1},\dots,y_n], \tag{1}$$

where $x_1,\dots,x_{n-1}, y_1,\dots,y_n \in \mathfrak{g}$.


In the definition of n-Lie algebra the identity (1) is called the Filippov identity 
[5]. It is clear that for n = 2 Filippov identity yields the classical Jacobi identity of 
a binary Lie algebra. 
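
The reduction to the Jacobi identity can be written out in two lines. The following is a sketch, assuming the Filippov identity is read in its standard form, in which the inner bracket $[x_1,\dots,x_{n-1},\,\cdot\,]$ is distributed over the arguments $y_1,\dots,y_n$; the letters x, y, z are only renamed arguments.

```latex
% Filippov identity for n = 2, with x_1 = x, y_1 = y, y_2 = z:
\[
  [x,[y,z]] = [[x,y],z] + [y,[x,z]].
\]
% Skew-symmetry gives [[x,y],z] = -[z,[x,y]] and [y,[x,z]] = -[y,[z,x]], hence
\[
  [x,[y,z]] + [y,[z,x]] + [z,[x,y]] = 0.
\]
```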


Definition 2 Let $\phi : \mathfrak{g}^n \to \mathfrak{g}$. A linear map $\tau : \mathfrak{g} \to K$ will be referred to as a $\phi$-trace
if $\tau(\phi(x_1,\dots,x_n)) = 0$ for all $x_1,\dots,x_n \in \mathfrak{g}$.


In [3] the authors proposed a method based on a $\phi$-trace which can be applied to
an n-Lie algebra to construct an (n + 1)-Lie algebra.


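Theorem 1 Let $(\mathfrak{g}, [\cdot,\dots,\cdot])$ be an n-Lie algebra and let $\tau$ be a $[\cdot,\dots,\cdot]$-trace. Define
$[\cdot,\dots,\cdot]_\tau : \mathfrak{g}^{n+1} \to \mathfrak{g}$ by

$$[x_1,\dots,x_{n+1}]_\tau = \sum_{i=1}^{n+1} (-1)^{i-1}\,\tau(x_i)\,[x_1,\dots,x_{i-1},x_{i+1},\dots,x_{n+1}]. \tag{2}$$

Then $(\mathfrak{g}, [\cdot,\dots,\cdot]_\tau)$ is an (n + 1)-Lie algebra.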
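
As an illustration of the construction in Theorem 1, the following sketch takes the simplest case n = 2: 2×2 integer matrices with the commutator as the binary Lie bracket and the ordinary matrix trace as the trace. It builds the induced ternary bracket with alternating signs and tests the ternary Filippov identity on random samples. The choice of example, the helper names and the sampling scheme are assumptions made for illustration only; a finite numerical test is a sanity check, not a proof.

```python
import random

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lin(c1, a, c2, b):
    """Entrywise linear combination c1*a + c2*b."""
    return [[c1 * x + c2 * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def comm(a, b):
    """Binary Lie bracket [a, b] = ab - ba."""
    return lin(1, matmul(a, b), -1, matmul(b, a))

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def induced(x1, x2, x3):
    """Induced ternary bracket: tr(x1)[x2,x3] - tr(x2)[x1,x3] + tr(x3)[x1,x2]."""
    first = lin(trace(x1), comm(x2, x3), -trace(x2), comm(x1, x3))
    return lin(1, first, trace(x3), comm(x1, x2))

def filippov_holds(x1, x2, y1, y2, y3):
    """Check the ternary Filippov identity on one tuple of arguments."""
    lhs = induced(x1, x2, induced(y1, y2, y3))
    rhs = lin(1, induced(induced(x1, x2, y1), y2, y3),
              1, induced(y1, induced(x1, x2, y2), y3))
    rhs = lin(1, rhs, 1, induced(y1, y2, induced(x1, x2, y3)))
    return lhs == rhs

def random_matrix(rng, size=2, bound=3):
    return [[rng.randint(-bound, bound) for _ in range(size)] for _ in range(size)]

rng = random.Random(0)
samples = [[random_matrix(rng) for _ in range(5)] for _ in range(100)]
all_ok = all(filippov_holds(*s) for s in samples)
print(all_ok)
```

Integer entries are used so that both sides of the identity can be compared exactly, without floating-point tolerance.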


It was shown in [1] that a similar method based on the notion of a supertrace can be
used in the case of n-Lie superalgebras: given an n-Lie superalgebra, one can apply
this method to induce an (n + 1)-Lie superalgebra. Recall that a super vector
space is a direct sum of two vector spaces, i.e. $V = V_{\bar 0} \oplus V_{\bar 1}$. The dimension of a finite
dimensional super vector space is denoted by m|n if $\dim V_{\bar 0} = m$ and $\dim V_{\bar 1} = n$.
An element $x \in V \setminus \{0\}$ is said to be homogeneous if either $x \in V_{\bar 0}$ or $x \in V_{\bar 1}$. For
homogeneous elements we can define parity by


$$|x| = \begin{cases} \bar 0, & x \in V_{\bar 0}, \\ \bar 1, & x \in V_{\bar 1}. \end{cases} \tag{3}$$


In what follows we will assume that element x is homogeneous whenever |x| is 
used. 


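Definition 3 We say that a super vector space $\mathfrak{g}$ endowed with an n-linear map $[\cdot,\dots,\cdot] : \mathfrak{g}^n \to \mathfrak{g}$
is an n-Lie superalgebra if for all $x_1,\dots,x_n, y_1,\dots,y_{n-1} \in \mathfrak{g}$
1. $|[x_1,\dots,x_n]| = \sum_{i=1}^{n} |x_i|$,
2. $[x_1,\dots,x_i,x_{i+1},\dots,x_n] = -(-1)^{|x_i||x_{i+1}|}[x_1,\dots,x_{i+1},x_i,\dots,x_n]$,
3. $[y_1,\dots,y_{n-1},[x_1,\dots,x_n]] = \sum_{i=1}^{n} (-1)^{|\mathbf{x}|_{i-1}|\mathbf{y}|_{n-1}} [x_1,\dots,x_{i-1},[y_1,\dots,y_{n-1},x_i],x_{i+1},\dots,x_n]$,
where $\mathbf{x} = (x_1,\dots,x_n)$, $\mathbf{y} = (y_1,\dots,y_{n-1})$, and $|\mathbf{x}|_i = \sum_{j=1}^{i} |x_j|$.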


Definition 4 Let $V = V_{\bar 0} \oplus V_{\bar 1}$ be a super vector space and let $\phi : V^n \to V$. We say
that a linear map $S : V \to K$ is a $\phi$-supertrace if
1. $S(\phi(x_1,\dots,x_n)) = 0$ for all $x_1,\dots,x_n \in V$,
2. $S(x) = 0$ for all $x \in V_{\bar 1}$.

Given an n-Lie superalgebra endowed with a supertrace (which satisfies the conditions
of the previous definition with respect to the graded commutator of this algebra)
we can construct an (n + 1)-Lie superalgebra by means of the method described in [1].


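Theorem 2 Let $(\mathfrak{g}, [\cdot,\dots,\cdot])$ be an n-Lie superalgebra and let $S : \mathfrak{g} \to K$ be a
$[\cdot,\dots,\cdot]$-supertrace. Define $[\cdot,\dots,\cdot]_S : \mathfrak{g}^{n+1} \to \mathfrak{g}$ by

$$[x_1,\dots,x_{n+1}]_S = \sum_{i=1}^{n+1} (-1)^{i-1}(-1)^{|x_i||\mathbf{x}|_{i-1}}\, S(x_i)\,[x_1,\dots,x_{i-1},x_{i+1},\dots,x_{n+1}]. \tag{4}$$

Then $(\mathfrak{g}, [\cdot,\dots,\cdot]_S)$ is an (n + 1)-Lie superalgebra.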
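
To see the induced bracket of Theorem 2 on a concrete example, the sketch below uses the Lie superalgebra of 2×2 matrices with one even and one odd basis direction (often written gl(1|1)), the graded commutator, and the supertrace given by the difference of the two diagonal entries. This example, the (matrix, parity) encoding and all function names are assumptions chosen for illustration; they are not taken from the text. The sign used for the i-th term is $(-1)^{i-1}(-1)^{|x_i||\mathbf{x}|_{i-1}}$.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Homogeneous elements are stored as (matrix, parity) pairs.
E = ([[1, 0], [0, 0]], 0)   # even
F = ([[0, 0], [0, 1]], 0)   # even
A = ([[0, 1], [0, 0]], 1)   # odd
B = ([[0, 0], [1, 0]], 1)   # odd

def supercomm(x, y):
    """Graded commutator [x, y] = xy - (-1)^{|x||y|} yx."""
    (a, p), (b, q) = x, y
    sign = -1 if p == 1 and q == 1 else 1
    ab, ba = matmul(a, b), matmul(b, a)
    m = [[ab[i][j] - sign * ba[i][j] for j in range(2)] for i in range(2)]
    return (m, (p + q) % 2)

def supertrace(x):
    """Difference of the diagonal entries; zero on odd elements."""
    m, parity = x
    return 0 if parity == 1 else m[0][0] - m[1][1]

def induced(x1, x2, x3):
    """Induced ternary bracket built from the graded commutator and supertrace."""
    xs = [x1, x2, x3]
    total = [[0, 0], [0, 0]]
    for i, (m, p) in enumerate(xs):
        prefix = sum(q for _, q in xs[:i])   # total parity of the preceding arguments
        sign = (-1) ** (i + p * prefix)      # i is zero-based, so (-1)^i plays the role of (-1)^{i-1}
        rest = xs[:i] + xs[i + 1:]
        term, _ = supercomm(rest[0], rest[1])
        c = sign * supertrace(xs[i])
        total = [[total[r][s] + c * term[r][s] for s in range(2)] for r in range(2)]
    return (total, sum(q for _, q in xs) % 2)

print(induced(E, A, B))
print(induced(A, E, B))
print(induced(E, F, A))
```

In this small example the only non-zero brackets of basis elements are multiples of the identity matrix, which has supertrace zero, in line with the requirement that a supertrace vanishes on brackets.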


3 Properties of Induced n-Lie Superalgebras 


In this section we study a structure of induced n-Lie superalgebra, and introducing 
the notions such as ideal of an n-Lie superalgebra, derived series, subalgebra of n-Lie 
superalgebra we prove several results which are analogous to the results proved in 
[3] in the case of n-Lie algebras. 


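Definition 5 Let $(\mathfrak{g}, [\cdot,\dots,\cdot])$ be an n-Lie superalgebra and let $\mathfrak{h}$ be a subspace of $\mathfrak{g}$.
We say that $\mathfrak{h}$ is an ideal of $\mathfrak{g}$ if for all $h \in \mathfrak{h}$ and for all $x_1,\dots,x_{n-1} \in \mathfrak{g}$ it holds
that $[h, x_1,\dots,x_{n-1}] \in \mathfrak{h}$.

Definition 6 Let $(\mathfrak{g}, [\cdot,\dots,\cdot])$ be an n-Lie superalgebra and let $\mathfrak{h}$ be an ideal of $\mathfrak{g}$.
The derived series of $\mathfrak{h}$ is defined as

$$D^0(\mathfrak{h}) = \mathfrak{h} \quad\text{and}\quad D^{p+1}(\mathfrak{h}) = [D^p(\mathfrak{h}),\dots,D^p(\mathfrak{h})], \quad p \in \mathbb{N},$$

and the descending central series of $\mathfrak{h}$ as

$$C^0(\mathfrak{h}) = \mathfrak{h} \quad\text{and}\quad C^{p+1}(\mathfrak{h}) = [C^p(\mathfrak{h}),\mathfrak{h},\dots,\mathfrak{h}], \quad p \in \mathbb{N}.$$

An ideal $\mathfrak{h}$ of an n-Lie superalgebra $\mathfrak{g}$ is said to be solvable if there exists $p \in \mathbb{N}$ such
that $D^p(\mathfrak{h}) = \{0\}$, and we call $\mathfrak{h}$ nilpotent if $C^p(\mathfrak{h}) = \{0\}$ for some $p \in \mathbb{N}$.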


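Proposition 1 Let $(\mathfrak{g}, [\cdot,\dots,\cdot])$ be an n-Lie superalgebra and let $\mathfrak{h} \subseteq \mathfrak{g}$ be a
subalgebra. If S is a supertrace of $[\cdot,\dots,\cdot]$, then $\mathfrak{h}$ is also a subalgebra of $(\mathfrak{g}, [\cdot,\dots,\cdot]_S)$.

Proof Let $\mathfrak{h}$ be a subalgebra of the n-Lie superalgebra $(\mathfrak{g}, [\cdot,\dots,\cdot])$, let $x_1,\dots,x_{n+1} \in \mathfrak{h}$ and
assume S is a supertrace of $[\cdot,\dots,\cdot]$. By (4), $[x_1,\dots,x_{n+1}]_S$ is a linear combination
of brackets of n elements of $\mathfrak{h}$, each of which lies in $\mathfrak{h}$, as desired.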
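Proposition 2 Let $\mathfrak{h}$ be an ideal of $(\mathfrak{g}, [\cdot,\dots,\cdot])$ and assume S is a supertrace of
$[\cdot,\dots,\cdot]$. Then $\mathfrak{h}$ is an ideal of $(\mathfrak{g}, [\cdot,\dots,\cdot]_S)$ if and only if $[\mathfrak{g},\mathfrak{g},\dots,\mathfrak{g}] \subseteq \mathfrak{h}$ or $\mathfrak{h} \subseteq \ker S$.


Proof Let $h \in \mathfrak{h}$ and $x_1,\dots,x_n \in \mathfrak{g}$. Then

$$[x_1,\dots,x_n,h]_S = \sum_{i=1}^{n} (-1)^{i-1}(-1)^{|x_i||\mathbf{x}|_{i-1}} S(x_i)[x_1,\dots,x_{i-1},x_{i+1},\dots,x_n,h] + (-1)^{n}(-1)^{|h||\mathbf{x}|_{n}} S(h)[x_1,\dots,x_n]. \tag{6}$$

Since $\mathfrak{h}$ is an ideal we have $[x_1,\dots,x_{i-1},x_{i+1},\dots,x_n,h] \in \mathfrak{h}$ for all $i = 1,\dots,n$.
Thus $[x_1,\dots,x_n,h]_S \in \mathfrak{h}$ is equivalent to

$$(-1)^{n}(-1)^{|h||\mathbf{x}|_{n}} S(h)[x_1,\dots,x_n] \in \mathfrak{h},$$

which clearly holds when $S(h) = 0$ or $[x_1,\dots,x_n] \in \mathfrak{h}$.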


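Proposition 3 Let $(\mathfrak{g}, [\cdot,\dots,\cdot])$ be an n-Lie superalgebra and let S be a supertrace of
$[\cdot,\dots,\cdot]$. Then the induced (n + 1)-Lie superalgebra $(\mathfrak{g}_S, [\cdot,\dots,\cdot]_S)$ is solvable.

Proof Assume $(\mathfrak{g}, [\cdot,\dots,\cdot])$ is an n-Lie superalgebra and S is a supertrace of $[\cdot,\dots,\cdot]$,
and let $x_1,\dots,x_{n+1} \in D^1(\mathfrak{g}_S)$.

Then for every $i = 1,\dots,n+1$ we have $x_i^1,\dots,x_i^{n+1} \in \mathfrak{g}$ such that $x_i = [x_i^1,\dots,x_i^{n+1}]_S$,
in which case

$$[x_1,\dots,x_{n+1}]_S = \sum_{i=1}^{n+1} (-1)^{i-1}(-1)^{|x_i||\mathbf{x}|_{i-1}} S\big([x_i^1,\dots,x_i^{n+1}]_S\big)[x_1,\dots,x_{i-1},x_{i+1},\dots,x_{n+1}] = 0,$$

since each $[x_i^1,\dots,x_i^{n+1}]_S$ is a linear combination of values of $[\cdot,\dots,\cdot]$, on which S vanishes.


In the light of the last proposition we can immediately see that if $(\mathfrak{g}, [\cdot,\dots,\cdot])$
is an n-Lie superalgebra, then for the induced (n + 1)-Lie superalgebra it holds that
$D^p(\mathfrak{g}_S) = \{0\}$ whenever $p \ge 2$.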


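Proposition 4 Let $(\mathfrak{g}, [\cdot,\dots,\cdot])$ be an n-Lie superalgebra, S a supertrace of $[\cdot,\dots,\cdot]$,
and assume the (n + 1)-Lie superalgebra $(\mathfrak{g}_S, [\cdot,\dots,\cdot]_S)$ is induced by S. Denote the
descending central series of $\mathfrak{g}$ by $(C^p(\mathfrak{g}))_{p=0}^{\infty}$ and the descending central series
of $\mathfrak{g}_S$ by $(C^p(\mathfrak{g}_S))_{p=0}^{\infty}$. Then

$$C^p(\mathfrak{g}_S) \subseteq C^p(\mathfrak{g}) \quad\text{for all } p \in \mathbb{N}.$$

If there exists $g \in \mathfrak{g}$ such that $[g,x_1,\dots,x_n]_S = [x_1,\dots,x_n]$ holds for all $x_1,x_2,\dots,x_n \in \mathfrak{g}$, then

$$C^p(\mathfrak{g}_S) = C^p(\mathfrak{g}) \quad\text{for all } p \in \mathbb{N}.$$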
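Proof The case p = 0 is trivial. Note that for p = 1 any $x = [x_1,\dots,x_{n+1}]_S \in C^1(\mathfrak{g}_S)$
can be expressed as

$$x = \sum_{i=1}^{n+1} (-1)^{i-1}(-1)^{|x_i||\mathbf{x}|_{i-1}} S(x_i)[x_1,\dots,x_{i-1},x_{i+1},\dots,x_{n+1}],$$

meaning x is a linear combination of elements of $C^1(\mathfrak{g})$.


Assume now that there exists $g \in \mathfrak{g}$ such that for all $y_1,\dots,y_n \in \mathfrak{g}$ it holds that
$[g,y_1,\dots,y_n]_S = [y_1,\dots,y_n]$. Then $x = [x_1,\dots,x_n] \in C^1(\mathfrak{g})$ can be written as
$[g,x_1,\dots,x_n]_S$ and thus $x \in C^1(\mathfrak{g}_S)$.

Next assume that the statement holds for some $p \in \mathbb{N}$ and let $x \in C^{p+1}(\mathfrak{g}_S)$. Then
there are $x_1,\dots,x_n \in \mathfrak{g}$ and $g \in C^p(\mathfrak{g}_S)$ such that

$$x = [g,x_1,\dots,x_n]_S = (-1)^{n}(-1)^{|g||\mathbf{x}|_n}[x_1,\dots,x_n,g]_S = (-1)^{n}(-1)^{|g||\mathbf{x}|_n}\sum_{i=1}^{n} (-1)^{i-1}(-1)^{|x_i||\mathbf{x}|_{i-1}} S(x_i)[x_1,\dots,x_{i-1},x_{i+1},\dots,x_n,g],$$

since g can be expressed as a bracket of some elements, and hence $S(g) = 0$. On
the other hand, as $g \in C^p(\mathfrak{g}_S)$, by our inductive assumption $g \in C^p(\mathfrak{g})$, and thus
$x \in C^{p+1}(\mathfrak{g})$.


To complete the proof, assume that there exists $g \in \mathfrak{g}$ such that for all $y_1,\dots,y_n \in \mathfrak{g}$
the equality $[g,y_1,\dots,y_n]_S = [y_1,\dots,y_n]$ holds. If $x \in C^{p+1}(\mathfrak{g})$, then $x = [h,x_1,\dots,x_{n-1}]$,
where $x_1,\dots,x_{n-1} \in \mathfrak{g}$ and $h \in C^p(\mathfrak{g})$. Altogether we have

$$x = [h,x_1,\dots,x_{n-1}] = [g,h,x_1,\dots,x_{n-1}]_S = -(-1)^{|g||h|}[h,g,x_1,\dots,x_{n-1}]_S.$$

At the same time $h \in C^p(\mathfrak{g}) = C^p(\mathfrak{g}_S)$, which gives us $[h,g,x_1,\dots,x_{n-1}]_S \in C^{p+1}(\mathfrak{g}_S)$,
meaning $x \in C^{p+1}(\mathfrak{g}_S)$, as desired.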


4 Low Dimensional Ternary Lie Superalgebras 


In this section we propose a classification of low dimensional ternary Lie superalge- 
bras. 

First of all we find the number of different (non-isomorphic) 3-Lie superalgebras 
over C of dimension m|n where m+n <5. We also find the explicit commuta- 
tion relations of these 3-Lie superalgebras. We use a method which is based on the 
structure constants of an n-Lie superalgebra. 


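Definition 7 Let $\mathfrak{g} = \mathfrak{g}_{\bar 0} \oplus \mathfrak{g}_{\bar 1}$ be an n-Lie superalgebra, denote

$$\mathcal{B} = \{e_1,\dots,e_m, f_1,\dots,f_n\}$$

and assume $\{e_1,\dots,e_m\}$ spans $\mathfrak{g}_{\bar 0}$ and $\{f_1,\dots,f_n\}$ spans $\mathfrak{g}_{\bar 1}$. Elements $K^{B}_{A_1 \dots A_n}$
defined by

$$[z_{A_1},\dots,z_{A_n}] = K^{B}_{A_1 \dots A_n} z_B,$$

where $z_{A_1},\dots,z_{A_n}, z_B \in \mathcal{B}$, are said to be structure constants of $\mathfrak{g}$ with respect to $\mathcal{B}$.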


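Assume we have a 3-Lie superalgebra $(\mathfrak{g}, [\cdot,\cdot,\cdot])$ of dimension m|n over $\mathbb{C}$.
Denote

$$\mathcal{B} = \{e_1,\dots,e_m, f_1,\dots,f_n\} = \{z_1,\dots,z_{m+n}\}$$

and assume $e_\alpha$, $1 \le \alpha \le m$, spans the even part of $\mathfrak{g}$ and $f_i$, $1 \le i \le n$, spans the
odd part of $\mathfrak{g}$. Additionally, let $z_A = e_A$ when $1 \le A \le m$, and $z_A = f_{A-m}$ when
$m < A \le m+n$. Since $|[z_1,z_2,z_3]| = |z_1| + |z_2| + |z_3|$ we can express the values
of the commutator on generators using structure constants in the following form:

$$[e_\alpha,e_\beta,e_\gamma] = K^{\delta}_{\alpha\beta\gamma}\, e_\delta,$$
$$[e_\alpha,e_\beta,f_i] = K^{j}_{\alpha\beta i}\, f_j,$$
$$[e_\alpha,f_i,f_j] = K^{\beta}_{\alpha ij}\, e_\beta,$$
$$[f_i,f_j,f_k] = K^{l}_{ijk}\, f_l,$$

where $\alpha \le \beta \le \gamma$ and $i \le j \le k$. As all other possible orderings and combinations of
generators can be transformed into one of these four forms by graded skew-symmetry
of $[\cdot,\cdot,\cdot]$, we will not consider them.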

As a next step we can eliminate the combinations that are trivial. To find such
brackets we can consider different permutations of the arguments. If some permutation
yields the initial ordering but changes the sign, then the bracket must be zero,
as in


$$[e_1,e_1,f_1] = -(-1)^{|e_1||e_1|}[e_1,e_1,f_1] = -[e_1,e_1,f_1].$$
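
The elimination step can be sketched in a few lines. The snippet below lists, for a given dimension m|n, the ordered triples of generators whose bracket is not forced to vanish by graded skew-symmetry; a triple is discarded exactly when an even generator is repeated, whereas repeated odd generators are allowed. The function name and the encoding of generators as pairs such as ("e", 1) are assumptions introduced for illustration.

```python
from itertools import combinations_with_replacement

def candidate_brackets(m, n):
    """Ordered triples of generators whose bracket may be non-zero."""
    gens = [("e", a) for a in range(1, m + 1)] + [("f", i) for i in range(1, n + 1)]
    kept = []
    for triple in combinations_with_replacement(gens, 3):
        # In an ordered triple equal generators are adjacent, so it suffices
        # to compare neighbours; a repeated even generator forces zero.
        repeated_even = any(x == y and x[0] == "e"
                            for x, y in zip(triple, triple[1:]))
        if not repeated_even:
            kept.append(triple)
    return kept

for dims in [(0, 1), (1, 1), (0, 2), (2, 1)]:
    print(dims, len(candidate_brackets(*dims)))
```

For dimension 0|2 this leaves four brackets, one for each ordered triple of the two odd generators, and for dimension 2|1 it leaves the brackets of (e1, e2, f1), (e1, f1, f1), (e2, f1, f1) and (f1, f1, f1).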


Finally we can use the graded Filippov identity. Consider $[z_A,z_B,z_C] = K^{D}_{ABC} z_D \neq 0$,
where $1 \le A \le B \le C \le m+n$, and calculate

$$[z_E,z_F,[z_A,z_B,z_C]]$$

using two different paths. Firstly use what is known and write

$$[z_E,z_F,[z_A,z_B,z_C]] = K^{D}_{ABC}\,[z_E,z_F,z_D].$$

Then transform the bracket $[z_E,z_F,z_D]$ to $(-1)^{\theta_{DEF}}[z_{D'},z_{E'},z_{F'}]$, where
$\{D,E,F\} = \{D',E',F'\}$ but $D' \le E' \le F'$, and $(-1)^{\theta_{DEF}}$ gives the sign that
comes from graded skew-symmetry. Note that $[z_{D'},z_{E'},z_{F'}]$ can be expressed
using structure constants and generators as well, and thus we have
$[z_{D'},z_{E'},z_{F'}] = K^{H}_{D'E'F'} z_H$, which means that on the one hand

$$[z_E,z_F,[z_A,z_B,z_C]] = (-1)^{\theta_{DEF}} K^{D}_{ABC} K^{H}_{D'E'F'}\, z_H.$$

On the other hand we can use the Filippov identity to calculate $[z_E,z_F,[z_A,z_B,z_C]]$:

$$[z_E,z_F,[z_A,z_B,z_C]] = [[z_E,z_F,z_A],z_B,z_C] + (-1)^{|z_A|(|z_E|+|z_F|)}[z_A,[z_E,z_F,z_B],z_C] + (-1)^{(|z_A|+|z_B|)(|z_E|+|z_F|)}[z_A,z_B,[z_E,z_F,z_C]].$$

In every summand we can apply the same construction as described above. To
do that, let us denote $\zeta_{AEF} = |z_A|(|z_E| + |z_F|)$ and $\zeta_{ABEF} = (|z_A| + |z_B|)(|z_E| + |z_F|)$.
Now reorder the arguments in increasing order and replace the result with a
linear combination of generators and structure constants. By doing so we end up
having

$$[z_E,z_F,[z_A,z_B,z_C]] = (-1)^{\theta_{AEF}+\theta_{B'C'G'}} K^{G}_{A'E'F'} K^{H}_{B'C'G'}\, z_H + (-1)^{\zeta_{AEF}+\theta_{BEF}+\theta_{A'C'G'}} K^{G}_{B'E'F'} K^{H}_{A'C'G'}\, z_H + (-1)^{\zeta_{ABEF}+\theta_{CEF}+\theta_{A'B'G'}} K^{G}_{C'E'F'} K^{H}_{A'B'G'}\, z_H.$$

In other words the following system of quadratic equations emerges:

$$(-1)^{\theta_{DEF}} K^{D}_{ABC} K^{H}_{D'E'F'}\, z_H = (-1)^{\theta_{AEF}+\theta_{B'C'G'}} K^{G}_{A'E'F'} K^{H}_{B'C'G'}\, z_H + (-1)^{\zeta_{AEF}+\theta_{BEF}+\theta_{A'C'G'}} K^{G}_{B'E'F'} K^{H}_{A'C'G'}\, z_H + (-1)^{\zeta_{ABEF}+\theta_{CEF}+\theta_{A'B'G'}} K^{G}_{C'E'F'} K^{H}_{A'B'G'}\, z_H,$$

where the generators $z_H$ are known and the structure constants $K^{D}_{ABC}$ are unknown.
Furthermore, for every $H \in \{1,2,\dots,m+n\}$ we have

$$(-1)^{\theta_{DEF}} K^{D}_{ABC} K^{H}_{D'E'F'} = (-1)^{\theta_{AEF}+\theta_{B'C'G'}} K^{G}_{A'E'F'} K^{H}_{B'C'G'} + (-1)^{\zeta_{AEF}+\theta_{BEF}+\theta_{A'C'G'}} K^{G}_{B'E'F'} K^{H}_{A'C'G'} + (-1)^{\zeta_{ABEF}+\theta_{CEF}+\theta_{A'B'G'}} K^{G}_{C'E'F'} K^{H}_{A'B'G'}.$$

In summary, we have a system of quadratic equations whose solutions are the possible
structure constants of an m|n-dimensional 3-Lie superalgebra. We note, however, that
the structure constants depend on the choice of basis for the super vector space,
and thus solutions that are equivalent under a change of basis have to be removed case by case.
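
Rather than solving the quadratic system, one can at least test a candidate set of structure constants against the graded Filippov identity directly. The sketch below does this for ternary brackets: a bracket is given by a table of its values on ordered triples of generators, extended to all orderings by graded skew-symmetry (each adjacent swap contributes the factor $-(-1)^{|a||b|}$) and then trilinearly. The data layout, the function names and the two sample tables are assumptions made for illustration.

```python
from itertools import product

def make_bracket(parity, table):
    """Trilinear bracket on coefficient vectors.

    parity[A] is 0 (even) or 1 (odd) for the generator with index A;
    table maps an ordered index triple to the coefficient list of its bracket.
    Triples missing from the table are taken to be zero.
    """
    dim = len(parity)

    def basis_bracket(indices):
        idx, sign = list(indices), 1
        for _ in range(2):                      # two passes sort three items
            for k in range(2):
                a, b = idx[k], idx[k + 1]
                if a > b:
                    idx[k], idx[k + 1] = b, a
                    sign *= 1 if parity[a] and parity[b] else -1
        for k in range(2):                      # repeated even generator gives zero
            if idx[k] == idx[k + 1] and parity[idx[k]] == 0:
                return [0] * dim
        return [sign * c for c in table.get(tuple(idx), [0] * dim)]

    def bracket(u, v, w):
        out = [0] * dim
        for a, b, c in product(range(dim), repeat=3):
            coeff = u[a] * v[b] * w[c]
            if coeff:
                for d, val in enumerate(basis_bracket((a, b, c))):
                    out[d] += coeff * val
        return out

    return bracket

def satisfies_filippov(parity, table):
    """Test the graded Filippov identity on all 5-tuples of generators."""
    dim = len(parity)
    bracket = make_bracket(parity, table)
    unit = lambda a: [1 if d == a else 0 for d in range(dim)]
    for e, f, a, b, c in product(range(dim), repeat=5):
        zE, zF, zA, zB, zC = map(unit, (e, f, a, b, c))
        p_ef = parity[e] + parity[f]
        s2 = (-1) ** (parity[a] * p_ef)
        s3 = (-1) ** ((parity[a] + parity[b]) * p_ef)
        lhs = bracket(zE, zF, bracket(zA, zB, zC))
        t1 = bracket(bracket(zE, zF, zA), zB, zC)
        t2 = bracket(zA, bracket(zE, zF, zB), zC)
        t3 = bracket(zA, zB, bracket(zE, zF, zC))
        if lhs != [x + s2 * y + s3 * z for x, y, z in zip(t1, t2, t3)]:
            return False
    return True

# Dimension 0|1 with [f1, f1, f1] = f1: the identity fails.
print(satisfies_filippov([1], {(0, 0, 0): [1]}))
# Dimension 0|2 with the single relation [f1, f1, f1] = f2: the identity holds.
print(satisfies_filippov([1, 1], {(0, 0, 0): [0, 1]}))
```

In the first sample the left-hand side of the identity equals $f_1$ while the right-hand side equals $3f_1$, so no non-zero multiple of $f_1$ can serve as the value of $[f_1,f_1,f_1]$ in dimension 0|1.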

Applying the described algorithm to concrete cases gives us the following theo- 
rems. 


Theorem 3 3-Lie superalgebras over $\mathbb{C}$ whose super vector space dimension is 0|1
or 1|1 are Abelian.


Theorem 4 3-Lie superalgebras over $\mathbb{C}$ whose super vector space dimension is 0|2
or 1|2 are either Abelian or isomorphic to a 3-Lie superalgebra $\mathfrak{h}$ whose non-trivial
commutation relations are

$$\begin{cases} [f_1,f_1,f_1] = -f_1 + f_2, \\ [f_1,f_1,f_2] = -f_1 + f_2, \\ [f_1,f_2,f_2] = -f_1 + f_2, \\ [f_2,f_2,f_2] = -f_1 + f_2, \end{cases} \quad\text{or}\quad [f_1,f_1,f_1] = f_2,$$

where $f_1, f_2$ are odd generators of $\mathfrak{h}$.


Theorem 5 3-Lie superalgebras over $\mathbb{C}$ whose super vector space dimension is
2|1 are either Abelian or isomorphic to a 3-Lie superalgebra $\mathfrak{h}$ whose non-trivial
commutation relations are


$$[e_1,f_1,f_1] = e_1 + e_2, \quad \ldots, \quad [e_1,e_2,f_1] = f_1, \quad\text{or}\quad [f_1,f_1,f_1] = f_1,$$


where $e_1, e_2$ are even generators of $\mathfrak{h}$ and $f_1$ is an odd generator of $\mathfrak{h}$.


5 Supermodule Over Clifford Algebra 


In this section we apply the method described in Sect. 2 to a Clifford algebra. It is well
known that a Clifford algebra can be equipped with the structure of a superalgebra if
one associates degree 1 to each generator of the Clifford algebra and defines the degree of a
product of generators as the sum of the degrees of its factors. Then, making use of the graded
commutator, we can consider a Clifford algebra as a Lie superalgebra. A Clifford
algebra has a matrix representation, and this allows one to introduce a supertrace. Hence
we have a Lie superalgebra endowed with a supertrace, and we can apply the method
described in Theorem 2 to construct a 3-Lie superalgebra. In this section we will
give an explicit description of the structure of this constructed 3-Lie superalgebra.

A Clifford algebra $C_n$ is the unital associative algebra over $\mathbb{C}$ generated by 
$\gamma_1, \gamma_2, \ldots, \gamma_n$ which obey the relations 

\[ \gamma_i \gamma_j + \gamma_j \gamma_i = 2 \delta_{ij}\, e, \quad i, j = 1, 2, \ldots, n, \tag{7} \]

where $e$ is the unit element of the Clifford algebra. Let $N = \{1, 2, \ldots, n\}$ be the set of 
integers from 1 to $n$. If $I$ is a subset of $N$, i.e. $I = \{i_1, i_2, \ldots, i_k\}$ where 
$1 \le i_1 < i_2 < \cdots < i_k \le n$, then one can associate to this subset $I$ the monomial 
$\gamma_I = \gamma_{i_1} \gamma_{i_2} \cdots \gamma_{i_k}$. If $I = \emptyset$ one defines $\gamma_\emptyset = e$. The number of elements of a subset $I$ 
will be denoted by $|I|$. It is obvious that the vector space of the Clifford algebra $C_n$ is spanned 
by the monomials $\gamma_I$, where $I \subseteq N$. Hence the dimension of this vector space is $2^n$ and any 
element $x \in C_n$ can be expressed in terms of these monomials as 

\[ x = \sum_{I \subseteq N} a_I \gamma_I, \]

where $a_I = a_{i_1 i_2 \ldots i_k}$ is a complex number. It is easy to see that one can endow a Clifford 
algebra $C_n$ with a $\mathbb{Z}_2$-graded structure by assigning the degree $|\gamma_I| = |I| \pmod 2$ 
to the monomial $\gamma_I$. Then the Clifford algebra $C_n$ can be considered as a superalgebra, 
since for any two monomials it holds that $|\gamma_I \gamma_J| = |\gamma_I| + |\gamma_J|$. 

Another way to construct this superalgebra which does not contain explicit refer- 
ence to Clifford algebra is given by the following theorem. 


Theorem 6 Let $I$ be a subset of $N = \{1, 2, \ldots, n\}$, and $\gamma_I$ be a symbol associated 
to $I$. Let $C_n$ be the vector space spanned by the symbols $\gamma_I$. Define the degree of $\gamma_I$ 
by $|\gamma_I| = |I| \pmod 2$, where $|I|$ is the number of elements of $I$, and the product of 
$\gamma_I, \gamma_J$ by 

\[ \gamma_I \gamma_J = (-1)^{\sigma(I,J)}\, \gamma_{I \Delta J}, \tag{8} \]

where $\sigma(I, J) = \sum_{j \in J} \sigma(I, j)$, $\sigma(I, j)$ is the number of elements of $I$ which are 
greater than $j \in J$, and $I \Delta J$ is the symmetric difference of the two subsets. Then $C_n$ is 
a unital associative superalgebra, where the unit element $e$ is $\gamma_\emptyset$. 


This theorem can be proved by means of the properties of the symmetric difference 
of two subsets. We remind the reader that the symmetric difference is commutative, 
$I \Delta J = J \Delta I$, associative, $(I \Delta J) \Delta K = I \Delta (J \Delta K)$, and satisfies 
$I \Delta \emptyset = \emptyset \Delta I = I$. The latter shows that $\gamma_\emptyset$ is the unit element of this superalgebra. 
The symmetric difference also satisfies $|I \Delta J| = |I| + |J| \pmod 2$. Hence $C_n$ is a superalgebra. 
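
The product rule (8) is easy to check by machine. The following Python sketch is an illustration only and not part of the original construction: a multiple $\pm\gamma_I$ is encoded as a pair (sign, frozenset), and the helper names `sigma`, `mul`, `subsets`, `is_associative` and `clifford_relations_hold` are hypothetical. It verifies associativity and the Clifford relations (7) by brute force for small n.

```python
from itertools import combinations


def sigma(I, J):
    """Number of pairs (i, j) with i in I, j in J and i > j."""
    return sum(1 for i in I for j in J if i > j)


def mul(I, J):
    """Return (sign, I Δ J) with gamma_I * gamma_J = sign * gamma_{I Δ J}."""
    I, J = frozenset(I), frozenset(J)
    return (-1) ** sigma(I, J), I ^ J


def subsets(n):
    """All subsets of {1, ..., n} as frozensets."""
    items = range(1, n + 1)
    return [frozenset(c) for k in range(n + 1) for c in combinations(items, k)]


def is_associative(n):
    """Check (gamma_I gamma_J) gamma_K == gamma_I (gamma_J gamma_K) for all I, J, K."""
    S = subsets(n)
    for I in S:
        for J in S:
            s1, IJ = mul(I, J)
            for K in S:
                s2, left = mul(IJ, K)
                t1, JK = mul(J, K)
                t2, right = mul(I, JK)
                if (s1 * s2, left) != (t1 * t2, right):
                    return False
    return True


def clifford_relations_hold(n):
    """Check gamma_i gamma_j + gamma_j gamma_i = 2 delta_ij e for the generators."""
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            s_ij, A = mul({i}, {j})
            s_ji, B = mul({j}, {i})
            if i == j:
                if A != frozenset() or s_ij + s_ji != 2:
                    return False
            elif A != B or s_ij + s_ji != 0:
                return False
    return True
```

For example, `mul({2}, {1, 2})` encodes the product $\gamma_2 \gamma_{12}$, and the empty frozenset plays the role of the unit $e$.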

The superalgebra $C_n$ can be considered as a super Lie algebra if for any two 
homogeneous elements $x, y$ of this superalgebra one introduces the graded 
commutator $[x, y] = xy - (-1)^{|x||y|} yx$ and extends it by linearity to the whole superalgebra 
$C_n$. We will denote this super Lie algebra by $\mathfrak{C}_n$. Then $\{\gamma_I\}_{I \subseteq N}$ are the generators 
of this super Lie algebra $\mathfrak{C}_n$, and its structure is entirely determined by the graded 
commutators of the $\gamma_I$. For any two generators $\gamma_I, \gamma_J$ we have 

\[ [\gamma_I, \gamma_J] = f(I, J)\, \gamma_{I \Delta J}, \tag{9} \]

where $f(I, J)$ is the integer-valued function of two subsets of $N$ defined by 

\[ f(I, J) = (-1)^{\sigma(I,J)} \left( 1 - (-1)^{|I \cap J|} \right). \]

It is easy to verify that the degree of the graded commutator is consistent with the degrees 
of the generators, i.e. $|[\gamma_I, \gamma_J]| = |\gamma_I| + |\gamma_J|$. Indeed, the function $\sigma(I, J)$ satisfies 

\[ \sigma(I, J) = |I||J| - |I \cap J| - \sigma(J, I), \]

and 

\[
\begin{aligned}
f(I, J) &= (-1)^{\sigma(I,J)} \left( 1 - (-1)^{|I \cap J|} \right) \\
&= (-1)^{|I||J| - |I \cap J| - \sigma(J,I)} \left( 1 - (-1)^{|I \cap J|} \right) \\
&= (-1)^{|I||J|} (-1)^{\sigma(J,I)} \left( (-1)^{|I \cap J|} - 1 \right) = -(-1)^{|I||J|} f(J, I).
\end{aligned}
\]

Hence $[\gamma_I, \gamma_J] = -(-1)^{|I||J|} [\gamma_J, \gamma_I]$, which shows that the relation (9) is consistent 
with the symmetries of the graded commutator. It is obvious that if the intersection of the 
subsets $I, J$ contains an even number of elements then $f(I, J) = 0$, and the graded 
commutator of $\gamma_I, \gamma_J$ is trivial. In particular, if at least one of the two subsets $I, J$ is the 
empty set then $f(I, J) = 0$. Thus any graded commutator (9) containing $e$ is trivial. 

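As an example, consider the super Lie algebra $\mathfrak{C}_2$. Its underlying vector space is 
4-dimensional and $\mathfrak{C}_2$ is generated by two even degree generators $e, \gamma_{12}$ and two odd 
degree generators $\gamma_1, \gamma_2$. The non-trivial relations of this Lie superalgebra are given 
by 

\[ [\gamma_1, \gamma_1] = [\gamma_2, \gamma_2] = 2e, \quad [\gamma_1, \gamma_{12}] = 2\gamma_2, \quad [\gamma_2, \gamma_{12}] = -2\gamma_1. \tag{10} \]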
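
The structure function $f(I, J)$ of relation (9) can be tabulated directly. The sketch below is a hedged illustration (the names `sigma`, `f`, `bracket`, `subsets`, `nontrivial_brackets` and `graded_symmetry_holds` are hypothetical): it lists the non-trivial graded commutators of the basis elements for a given n and checks the symmetry $f(I, J) = -(-1)^{|I||J|} f(J, I)$.

```python
from itertools import combinations


def sigma(I, J):
    """Number of pairs (i, j) with i in I, j in J and i > j."""
    return sum(1 for i in I for j in J if i > j)


def f(I, J):
    """Structure function: [gamma_I, gamma_J] = f(I, J) * gamma_{I Δ J}."""
    I, J = frozenset(I), frozenset(J)
    return (-1) ** sigma(I, J) * (1 - (-1) ** len(I & J))


def bracket(I, J):
    """Graded commutator of basis elements as (coefficient, subset)."""
    return f(I, J), frozenset(I) ^ frozenset(J)


def subsets(n):
    items = range(1, n + 1)
    return [frozenset(c) for k in range(n + 1) for c in combinations(items, k)]


def nontrivial_brackets(n):
    """All ordered pairs (I, J) of basis elements with non-zero graded commutator."""
    S = subsets(n)
    return {(I, J): bracket(I, J) for I in S for J in S if f(I, J) != 0}


def graded_symmetry_holds(n):
    """Check f(I, J) = -(-1)^{|I||J|} f(J, I) for all subsets of {1, ..., n}."""
    S = subsets(n)
    return all(f(I, J) == -(-1) ** (len(I) * len(J)) * f(J, I)
               for I in S for J in S)
```

For n = 2 the ordered pairs returned by `nontrivial_brackets(2)` are the relations of the example above together with their graded-symmetric counterparts.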


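Now we assume that $n = 2m$, $m \ge 1$, is an even integer. The Lie superalgebra $\mathfrak{C}_n$ 
has a matrix representation which can be described as follows. Fix $n = 2$ and identify 
the generators $\gamma_1, \gamma_2$ with the Pauli matrices $\sigma_1, \sigma_2$, i.e. 

\[ \gamma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \gamma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}. \tag{11} \]

Then $\gamma_{12} = \gamma_1 \gamma_2 = i \sigma_3$, where 

\[ \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \]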

Let $S^2$ be the 2-dimensional complex super vector space $\mathbb{C}^2$ with the odd degree 
operators (11), where the $\mathbb{Z}_2$-graded structure of $S^2$ is determined by $\sigma_3 = -i \gamma_{12}$. 
Then $C_2 \simeq \mathrm{End}(S^2)$, and $S^2$ can be considered as a supermodule over the superalgebra 
$C_2$. Let $S^n = S^2 \otimes S^2 \otimes \cdots \otimes S^2$ ($m$ times). Then $S^n$ can be viewed as a 
supermodule over the $m$-fold tensor product of $C_2$, which can be identified with $C_n$ 
by identifying $\gamma_1, \gamma_2$ in the $j$th factor with $\gamma_{2j-1}, \gamma_{2j}$ in $C_n$. This $C_n$-supermodule $S^n$ 
is called the supermodule of spinors [6]. Hence we have the matrix representation 
for the Clifford algebra $C_n$, and this matrix representation or supermodule of spinors 
allows one to consider the supertrace, and it can be proved [6] that 

\[ \mathrm{Str}(\gamma_I) = \begin{cases} 0, & I \neq N, \\ (2i)^m, & I = N. \end{cases} \tag{12} \]
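
For n = 2 the matrix representation can be checked by hand or with a few lines of code. The sketch below assumes the usual convention that the supertrace on $S^2$ is the trace over the even part minus the trace over the odd part, i.e. $\mathrm{Str}(A) = \mathrm{Tr}(\sigma_3 A)$; this convention, and the names `matmul`, `trace`, `supertrace`, `E`, `G1`, `G2`, `G12`, `S3`, are assumptions made for illustration. Plain nested lists are used to keep the example free of dependencies.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace(A):
    return A[0][0] + A[1][1]


E = [[1, 0], [0, 1]]         # unit element e
G1 = [[0, 1], [1, 0]]        # gamma_1 = sigma_1
G2 = [[0, -1j], [1j, 0]]     # gamma_2 = sigma_2
S3 = [[1, 0], [0, -1]]       # sigma_3, the grading operator on S^2
G12 = matmul(G1, G2)         # gamma_12 = gamma_1 gamma_2


def supertrace(A):
    """Assumed convention: Str(A) = Tr(sigma_3 A)."""
    return trace(matmul(S3, A))
```

Under this convention only the top monomial $\gamma_{12}$ has a non-zero supertrace, with value $2i$.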

Now we have the Lie superalgebra $\mathfrak{C}_n$ with the graded commutator defined in 
(9) and its matrix representation based on the supermodule of spinors. Hence we 
can construct a 3-Lie superalgebra by making use of the graded ternary commutator (4). 
Applying the formula (4) we define the graded ternary commutator for any triple 
$\gamma_I, \gamma_J, \gamma_K$ of elements of the basis for $\mathfrak{C}_n$ by 

\[
\begin{aligned}
[\gamma_I, \gamma_J, \gamma_K] = {} & \mathrm{Str}(\gamma_I)\, [\gamma_J, \gamma_K] - (-1)^{|I||J|}\, \mathrm{Str}(\gamma_J)\, [\gamma_I, \gamma_K] \\
& + (-1)^{|K|(|I|+|J|)}\, \mathrm{Str}(\gamma_K)\, [\gamma_I, \gamma_J],
\end{aligned}
\tag{13}
\]

where the binary graded commutator on the right-hand side of this formula is defined 
by (9). According to Theorem 2 the vector space spanned by $\gamma_I$, $I \subseteq N$, and equipped 
with the ternary graded commutator (13) is a 3-Lie superalgebra, which will be 
denoted by $\mathfrak{C}_n^{(3)}$. Making use of (9) we can write the expression on the right-hand side 
of the above formula in the form 

\[
\begin{aligned}
[\gamma_I, \gamma_J, \gamma_K] = {} & f(J, K)\, \mathrm{Str}(\gamma_I)\, \gamma_{J \Delta K} - (-1)^{|I||J|} f(I, K)\, \mathrm{Str}(\gamma_J)\, \gamma_{I \Delta K} \\
& + (-1)^{|K|(|I|+|J|)} f(I, J)\, \mathrm{Str}(\gamma_K)\, \gamma_{I \Delta J}.
\end{aligned}
\]

From the formula for the supertrace (12) it follows immediately that the above graded 
ternary commutator is trivial if none of the subsets $I, J, K$ is equal to $N$. Similarly, 
this graded ternary commutator is also trivial if all three subsets $I, J, K$ are equal to 
$N$, i.e. $I = J = K = N$, or two of them are equal to $N$. 


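Proposition 5 The graded ternary commutators of the generators $\gamma_I$, $I \subseteq N$, of the 
3-Lie superalgebra $\mathfrak{C}_n^{(3)}$ are given by 

\[ [\gamma_I, \gamma_J, \gamma_K] = \begin{cases} (2i)^m f(I, J)\, \gamma_{I \Delta J}, & I \neq N,\ J \neq N,\ K = N, \\ 0, & \text{in all other cases.} \end{cases} \tag{14} \]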
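
The ternary commutator (13) can be evaluated on basis elements with a short script. The following sketch is hedged: it assumes the supertrace values $\mathrm{Str}(\gamma_N) = (2i)^m$ and $\mathrm{Str}(\gamma_I) = 0$ for $I \neq N$, and the function names (`sigma`, `f`, `supertrace`, `ternary`) are hypothetical. A result is returned as a dictionary mapping a subset $L$ to the coefficient of $\gamma_L$.

```python
def sigma(I, J):
    """Number of pairs (i, j) with i in I, j in J and i > j."""
    return sum(1 for i in I for j in J if i > j)


def f(I, J):
    """Structure function of the binary graded commutator (9)."""
    I, J = frozenset(I), frozenset(J)
    return (-1) ** sigma(I, J) * (1 - (-1) ** len(I & J))


def supertrace(I, n):
    """Assumed values: (2i)^m for I = {1, ..., n} with n = 2m, and 0 otherwise."""
    top = frozenset(range(1, n + 1))
    return (2j) ** (n // 2) if frozenset(I) == top else 0


def ternary(I, J, K, n):
    """Evaluate [gamma_I, gamma_J, gamma_K] term by term from formula (13)."""
    I, J, K = map(frozenset, (I, J, K))
    terms = [
        (supertrace(I, n) * f(J, K), J ^ K),
        (-(-1) ** (len(I) * len(J)) * supertrace(J, n) * f(I, K), I ^ K),
        ((-1) ** (len(K) * (len(I) + len(J))) * supertrace(K, n) * f(I, J), I ^ J),
    ]
    result = {}
    for coeff, subset in terms:
        if coeff != 0:
            result[subset] = result.get(subset, 0) + coeff
    # Drop entries that cancelled to zero.
    return {s: c for s, c in result.items() if c != 0}
```

For n = 2, `ternary({1}, {1}, {1, 2}, 2)` returns the coefficient $(2i) \cdot f(\{1\}, \{1\}) = 4i$ at the empty subset, i.e. a multiple of the unit $e$.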
Semi-commutative Galois Extension 
and Reduced Quantum Plane 


Viktor Abramov and Md. Raknuzzaman 


Abstract In this paper we show that a semi-commutative Galois extension of an associative 
unital algebra by means of an element $\tau$, which satisfies $\tau^N = 1$ (1 is the 
identity element of the algebra and $N \ge 2$ is an integer), induces a structure of graded 
q-differential algebra, where q is a primitive Nth root of unity. A graded q-differential 
algebra with differential d, which satisfies $d^N = 0$, $N \ge 2$, can be viewed as a 
generalization of a graded differential algebra. The subalgebra of elements of degree zero 
and the subspace of elements of degree one of a graded q-differential algebra together 
with the differential d can be considered as a first order noncommutative differential 
calculus. In this paper we assume that we are given a semi-commutative Galois 
extension of an associative unital algebra, then we show how one can construct the 
graded q-differential algebra, and when this algebra is constructed we study its first 
order noncommutative differential calculus. We also study the subspaces of the graded 
q-differential algebra of degree greater than one, which we call the higher order 
noncommutative differential calculus induced by a semi-commutative Galois extension 
of an associative unital algebra. Finally we show that a reduced quantum plane can be viewed as a 
semi-commutative Galois extension of a fractional one-dimensional space and we 
apply the noncommutative differential calculus developed in the previous sections 
to a reduced quantum plane. 


Keywords Noncommutative differential calculus - Galois extension - Reduced 
quantum plane 
1 Introduction 


Let us briefly recall the definition of a noncommutative Galois extension [12-15]. 
Suppose $\tilde{\mathcal{A}}$ is an associative unital $\mathbb{C}$-algebra, $\mathcal{A} \subset \tilde{\mathcal{A}}$ is its subalgebra, and there is 
an element $\tau \in \tilde{\mathcal{A}}$ which satisfies $\tau \notin \mathcal{A}$, $\tau^N = 1$, where $N \ge 2$ is an integer and 
1 is the identity element of $\tilde{\mathcal{A}}$. A noncommutative Galois extension of $\mathcal{A}$ by means 
of $\tau$ is the smallest subalgebra $\mathcal{A}[\tau] \subset \tilde{\mathcal{A}}$ such that $\mathcal{A} \subset \mathcal{A}[\tau]$ and $\tau \in \mathcal{A}[\tau]$. It 
should be pointed out that the concept of noncommutative Galois extension can be 
applied not only to an associative unital algebra with a binary multiplication law but 
also to an algebra with a ternary multiplication law, for instance to a ternary 
analog of Grassmann and Clifford algebra [6, 14, 15], and this approach can be used 
in particle physics to construct an elegant algebraic model for quarks. 

A graded q-differential algebra can be viewed as a generalization of the notion of 
graded differential algebra if we use a more general equation $d^N = 0$, $N \ge 2$, than the 
basic equation $d^2 = 0$ of a graded differential algebra. This idea was proposed and 
developed within the framework of noncommutative geometry [10], where the author 
introduced the notions of N-complex and generalized cohomologies of an N-complex and, 
making use of an Nth primitive root of unity, constructed an analog of the algebra 
of differential forms in n-dimensional space with exterior differential satisfying the 
relation $d^N = 0$. Later this idea was developed in the paper [9], where the authors 
introduced and studied the notion of graded q-differential algebra. It was shown [1, 2, 
4, 5] that the notion of graded q-differential algebra can be applied in noncommutative 
geometry in order to construct a noncommutative generalization of differential forms 
and of the concept of connection. 

In this paper we will study a special case of noncommutative Galois extension 
which is called a semi-commutative Galois extension. A noncommutative Galois 
extension is referred to as a semi-commutative Galois extension [15] if for any 
element $x \in \mathcal{A}$ there exists an element $x' \in \mathcal{A}$ such that $x \tau = \tau x'$. In this paper we 
show that a semi-commutative Galois extension can be endowed with the structure of 
a graded algebra if we assign degree zero to the elements of the subalgebra $\mathcal{A}$ and degree 
one to $\tau$. This is the first step on the way to constructing the graded q-differential algebra 
if we are given a semi-commutative Galois extension. The second step is the theorem 
which states that if there exists an element $v$ of a graded associative unital $\mathbb{C}$-algebra 
which satisfies the relation $v^N = 1$ then this algebra can be endowed with the structure 
of a graded q-differential algebra. We can apply this theorem to a semi-commutative 
Galois extension because we have an element $\tau$ with the property $\tau^N = 1$, and this 
allows us to equip a semi-commutative Galois extension with the structure of a graded 
q-differential algebra. Then we study the first and higher order noncommutative 
differential calculus induced by the N-differential of the graded q-differential algebra. 
We introduce a derivative and differential with the help of the first order noncommutative 
differential calculus developed in the papers [3, 7]. We also study the higher order 
noncommutative differential calculus, and in this case we consider the differential d 
as an analog of the exterior differential and the elements of the higher order differential 
calculus as analogs of differential forms. Finally we apply our calculus to the reduced 
quantum plane [8]. 
2 Graded q-Differential Algebra Structure 
of Noncommutative Galois Extension 


In this section we recall the definitions of a noncommutative Galois extension and of a 
semi-commutative Galois extension, and show that given a semi-commutative Galois 
extension we can construct a graded q-differential algebra. 

First of all we recall the notion of a noncommutative Galois extension [12-15]. 


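Definition 1 Let $\tilde{\mathcal{A}}$ be an associative unital $\mathbb{C}$-algebra and $\mathcal{A} \subset \tilde{\mathcal{A}}$ be its subalgebra. 
If there exist an element $\tau \in \tilde{\mathcal{A}}$ and an integer $N \ge 2$ such that 


(i) $\tau^N = \pm 1$, 
(ii) $\tau^k \notin \mathcal{A}$ for any integer $1 \le k \le N-1$, 


then the smallest subalgebra $\mathcal{A}[\tau]$ of $\tilde{\mathcal{A}}$ which satisfies 


(iii) $\mathcal{A} \subset \mathcal{A}[\tau]$, 
(iv) $\tau \in \mathcal{A}[\tau]$, 


is called the noncommutative Galois extension of $\mathcal{A}$ by means of $\tau$. 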


In this paper we will study a particular case of a noncommutative Galois extension 
which is called a semi-commutative Galois extension [15]. A noncommutative Galois 
extension is referred to as a semi-commutative Galois extension if for any element 
$x \in \mathcal{A}$ there exists an element $x' \in \mathcal{A}$ such that $x \tau = \tau x'$. We will give this definition 
in terms of the left and right $\mathcal{A}$-modules generated by $\tau$. Let $\mathcal{A}^1_l[\tau]$ and $\mathcal{A}^1_r[\tau]$ be 
respectively the left and right $\mathcal{A}$-modules generated by $\tau$. Obviously we have 

\[ \mathcal{A}^1_l[\tau] \subset \mathcal{A}[\tau], \quad \mathcal{A}^1_r[\tau] \subset \mathcal{A}[\tau]. \]


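Definition 2 A noncommutative Galois extension $\mathcal{A}[\tau]$ is said to be a right 
(left) semi-commutative Galois extension if $\mathcal{A}^1_r[\tau] \subset \mathcal{A}^1_l[\tau]$ ($\mathcal{A}^1_l[\tau] \subset \mathcal{A}^1_r[\tau]$). If 
$\mathcal{A}^1_r[\tau] = \mathcal{A}^1_l[\tau]$ then the noncommutative Galois extension will be referred to as a 
semi-commutative Galois extension, and in this case $\mathcal{A}^1[\tau] = \mathcal{A}^1_l[\tau] = \mathcal{A}^1_r[\tau]$ is 
an $\mathcal{A}$-bimodule. 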


It is well known that a bimodule over an associative unital algebra $\mathcal{A}$ freely 
generated by the elements of its basis induces a homomorphism from the algebra $\mathcal{A}$ 
to the algebra of square matrices over $\mathcal{A}$. In the case of a semi-commutative Galois 
extension we have only one generator $\tau$, and it induces an endomorphism of the 
algebra $\mathcal{A}$. Indeed, let $\mathcal{A}[\tau]$ be a semi-commutative Galois extension and $\mathcal{A}^1[\tau]$ 
be its $\mathcal{A}$-bimodule generated by $\tau$. Any element of the right $\mathcal{A}$-module $\mathcal{A}^1_r[\tau]$ 
can be written as $\tau x$, where $x \in \mathcal{A}$. On the other hand, $\mathcal{A}[\tau]$ is a semi-commutative 
Galois extension, which means $\mathcal{A}^1_r[\tau] = \mathcal{A}^1_l[\tau]$, and hence each element $x \tau$ of the 
left $\mathcal{A}$-module can be expressed as $\tau \phi_\tau(x)$, where $\phi_\tau(x) \in \mathcal{A}$. It is easy to verify 
that the linear mapping $\phi_\tau : x \mapsto \phi_\tau(x)$ is an endomorphism of the subalgebra $\mathcal{A}$, i.e. 
for any elements $x, y \in \mathcal{A}$ we have $\phi_\tau(xy) = \phi_\tau(x) \phi_\tau(y)$. This endomorphism will 
play an important role in our differential calculus, and in what follows we will also 
use the notation $\phi_\tau(x) = x_\tau$. Thus 

\[ u \tau = \tau\, \phi_\tau(u), \quad u \tau = \tau\, u_\tau. \]


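It is clear that 

\[ \phi_\tau^N = \mathrm{id}_{\mathcal{A}}, \quad u_{\tau^N} = u, \]

because for any $u \in \mathcal{A}$ it holds that $u \tau^N = \tau^N \phi_\tau^N(u)$, and taking into account that 
$\tau^N = 1$ we get $\phi_\tau^N(u) = u$. 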


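Proposition 1 Let $\mathcal{A}[\tau]$ be a semi-commutative Galois extension of $\mathcal{A}$ by means 
of $\tau$, and $\mathcal{A}^k_l[\tau]$, $\mathcal{A}^k_r[\tau]$ be respectively the left and right $\mathcal{A}$-modules generated by 
$\tau^k$, where $k = 1, 2, \ldots, N-1$. Then $\mathcal{A}^k_l[\tau] = \mathcal{A}^k_r[\tau] = \mathcal{A}^k[\tau]$ is an $\mathcal{A}$-bimodule, 
and 

\[ \mathcal{A}[\tau] = \oplus_{k=0}^{N-1} \mathcal{A}^k[\tau] = \mathcal{A}^0[\tau] \oplus \mathcal{A}^1[\tau] \oplus \cdots \oplus \mathcal{A}^{N-1}[\tau], \]

where $\mathcal{A}^0[\tau] = \mathcal{A}$. 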


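Evidently the endomorphism of $\mathcal{A}$ induced by the $\mathcal{A}$-bimodule structure of $\mathcal{A}^k[\tau]$ 
is $\phi^k$, where $\phi : \mathcal{A} \to \mathcal{A}$ is the endomorphism induced by the $\mathcal{A}$-bimodule $\mathcal{A}^1[\tau]$. 
We will also use the notation $\phi^k(x) = x_{\tau^k}$. 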

It follows from Proposition 1 that a semi-commutative Galois extension $\mathcal{A}[\tau]$ has 
a natural $\mathbb{Z}_N$-graded structure which can be defined as follows: we assign degree zero 
to each element of the subalgebra $\mathcal{A}$, degree 1 to $\tau$, and extend this graded structure to 
the semi-commutative Galois extension $\mathcal{A}[\tau]$ by defining the degree of a product 
of two elements as the sum of the degrees of its factors. The degree of a homogeneous 
element of $\mathcal{A}[\tau]$ will be denoted by $|\;|$. Hence $|u| = 0$ for any $u \in \mathcal{A}$ and $|\tau| = 1$. 

Now our aim is to show that given a noncommutative Galois extension we can 
construct a graded q-differential algebra, where q is a primitive Nth root of unity. 
First of all we recall some basic notions, structures and theorems of the theory of graded 
q-differential algebras. 

Let $\mathcal{A} = \oplus_{k \in \mathbb{Z}_N} \mathcal{A}^k = \mathcal{A}^0 \oplus \mathcal{A}^1 \oplus \cdots \oplus \mathcal{A}^{N-1}$ be a $\mathbb{Z}_N$-graded associative 
unital $\mathbb{C}$-algebra with identity element denoted by 1. Obviously the subspace $\mathcal{A}^0$ of 
elements of degree 0 is a subalgebra of the graded algebra $\mathcal{A}$. Every subspace $\mathcal{A}^k$ 
of homogeneous elements of degree $k > 0$ can be viewed as an $\mathcal{A}^0$-bimodule. The 
graded q-commutator of two homogeneous elements $u, v \in \mathcal{A}$ is defined by 

\[ [v, u]_q = vu - q^{|v||u|}\, uv. \]

A graded q-derivation of degree $m$ of a graded algebra $\mathcal{A}$ is a linear mapping 
$d : \mathcal{A} \to \mathcal{A}$ of degree $m$, i.e. $d : \mathcal{A}^i \to \mathcal{A}^{i+m}$, which satisfies the graded q-Leibniz 
rule 

\[ d(uv) = d(u)\, v + q^{ml}\, u\, d(v), \tag{1} \]

where $u$ is a homogeneous element of degree $l$, i.e. $u \in \mathcal{A}^l$. A graded q-derivation $d$ 
of degree $m$ is called an inner graded q-derivation of degree $m$ induced by an element 
$v \in \mathcal{A}^m$ if 

\[ d(u) = [v, u]_q = vu - q^{ml}\, uv, \tag{2} \]

where $u \in \mathcal{A}^l$. 

Now let $q$ be a primitive Nth root of unity, for instance $q = e^{2\pi i/N}$. Then 

\[ q^N = 1, \quad 1 + q + \cdots + q^{N-1} = 0. \]
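
Both identities for a primitive Nth root of unity can be confirmed numerically. This is a small sketch using the standard `cmath` module; the function names `primitive_root` and `geometric_sum` are chosen here for illustration, and the comparison is up to floating-point rounding.

```python
import cmath


def primitive_root(N):
    """The primitive Nth root of unity q = exp(2*pi*i/N)."""
    return cmath.exp(2j * cmath.pi / N)


def geometric_sum(N):
    """Return 1 + q + ... + q^(N-1) for q = primitive_root(N)."""
    q = primitive_root(N)
    return sum(q ** k for k in range(N))
```

Both `abs(primitive_root(N) ** N - 1)` and `abs(geometric_sum(N))` are of the order of machine precision for every $N \ge 2$.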


A graded q-differential algebra is a graded associative unital algebra $\mathcal{A}$ endowed 
with a graded q-derivation $d$ of degree one which satisfies $d^N = 0$. In what follows 
the graded q-derivation $d$ of a graded q-differential algebra $\mathcal{A}$ will be referred to as 
a graded N-differential. Thus a graded N-differential $d$ of a graded q-differential 
algebra is a linear mapping of degree one which satisfies the graded q-Leibniz rule 
and $d^N = 0$. It is useful to recall that a graded differential algebra is a graded 
associative unital algebra equipped with a differential $d$ which satisfies the graded 
Leibniz rule and $d^2 = 0$. Hence it is easy to see that a graded differential algebra 
is a particular case of a graded q-differential algebra when $N = 2$, $q = -1$, and in 
this sense we can consider a graded q-differential algebra as a generalization of the 
concept of graded differential algebra. Given a graded associative algebra $\mathcal{A}$ we can 
consider the vector space of inner graded q-derivations of degree one of this algebra 
and pose the question: under what conditions is an inner graded q-derivation of degree 
one a graded N-differential? The following theorem answers this question. 


Theorem 1 Let $\mathcal{A}$ be a $\mathbb{Z}_N$-graded associative unital $\mathbb{C}$-algebra and $d(u) = [v, u]_q$ 
be its inner graded q-derivation induced by an element $v \in \mathcal{A}^1$. The inner graded 
q-derivation $d$ is an N-differential, i.e. it satisfies $d^N = 0$, if and only if $v^N = \pm 1$. 
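
A concrete model makes the "if" direction of this statement tangible. The sketch below is not taken from the text: it assumes the $\mathbb{Z}_N$-grading of $N \times N$ complex matrices in which the matrix unit $E_{ij}$ has degree $(i - j) \bmod N$, and takes for $v$ the cyclic shift matrix, which has degree one and satisfies $v^N = 1$. All function names are hypothetical.

```python
import cmath


def q_root(N):
    return cmath.exp(2j * cmath.pi / N)


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def shift(N):
    """Cyclic shift v = sum_i E_{i+1, i}: degree 1 in the assumed grading, v^N = 1."""
    return [[1 if (i - j) % N == 1 else 0 for j in range(N)] for i in range(N)]


def d(u, N):
    """Inner graded q-derivation d(u) = v u - q^{|u|} u v, extended by linearity."""
    q, v = q_root(N), shift(N)
    # Multiply each entry by q^degree, where entry (i, j) has degree (i - j) mod N.
    u_q = [[q ** ((i - j) % N) * u[i][j] for j in range(N)] for i in range(N)]
    vu, uv = matmul(v, u), matmul(u_q, v)
    return [[vu[i][j] - uv[i][j] for j in range(N)] for i in range(N)]


def d_power(u, N, k):
    """Apply d to u exactly k times."""
    for _ in range(k):
        u = d(u, N)
    return u


def max_abs(M):
    return max(abs(x) for row in M for x in row)
```

In this model `max_abs(d_power(u, N, N))` vanishes up to rounding for any matrix `u`, while lower powers of `d` are in general non-zero.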


Now our goal is to apply this theorem to a semi-commutative Galois extension to 
construct a graded q-differential algebra with an N-differential satisfying $d^N = 0$. 


Proposition 2 Let $q$ be a primitive Nth root of unity. A semi-commutative Galois 
extension $\mathcal{A}[\tau]$, equipped with the $\mathbb{Z}_N$-graded structure described above and with 
the inner graded q-derivation $d = [\tau, \cdot\,]_q$ induced by $\tau$, is a graded q-differential 
algebra, and $d$ is its N-differential. For any element $\xi$ of the semi-commutative Galois 
extension $\mathcal{A}[\tau]$ written as a sum of elements of the right $\mathcal{A}$-modules $\mathcal{A}^k[\tau]$ 

\[ \xi = \sum_{k=0}^{N-1} \tau^k u_k = 1\, u_0 + \tau\, u_1 + \cdots + \tau^{N-1} u_{N-1}, \quad u_k \in \mathcal{A}, \]

it holds that 

\[ d\xi = \sum_{k=0}^{N-1} \tau^{k+1} \left( u_k - q^k (u_k)_\tau \right), \tag{3} \]

where $u_k \mapsto (u_k)_\tau$ is the endomorphism of $\mathcal{A}$ induced by the bimodule structure of 
$\mathcal{A}^1[\tau]$. 


3 First Order Differential Calculus over Associative Unital 
Algebra 


In this section we describe a first order differential calculus over an associative unital 
algebra [7]. If an associative unital algebra is generated by a family of variables 
which obey commutation relations, then one can construct a coordinate first order 
differential calculus over this algebra. A coordinate first order differential calculus induces 
partial derivatives with respect to the generators of the algebra, and these partial 
derivatives satisfy the twisted Leibniz rule. 

A first order differential calculus is a triple $(\mathcal{A}, \mathcal{M}, d)$ where $\mathcal{A}$ is an associative 
unital algebra, $\mathcal{M}$ is an $\mathcal{A}$-bimodule, and $d$, which is called the differential of the first order 
differential calculus, is a linear mapping $d : \mathcal{A} \to \mathcal{M}$ satisfying the Leibniz rule 
$d(fh) = df\, h + f\, dh$, where $f, h \in \mathcal{A}$. A first order differential calculus $(\mathcal{A}, \mathcal{M}, d)$ 
is referred to as a coordinate first order differential calculus if the algebra $\mathcal{A}$ is 
generated by variables $x^1, x^2, \ldots, x^n$ which satisfy commutation relations, 
and the $\mathcal{A}$-bimodule $\mathcal{M}$, considered as a right $\mathcal{A}$-module, is freely generated by 
$dx^1, dx^2, \ldots, dx^n$. It is worth mentioning that the first order differential calculus was 
developed within the framework of noncommutative geometry, where the algebra $\mathcal{A}$ 
is usually considered as the algebra of functions of a noncommutative space, the 
generators $x^1, x^2, \ldots, x^n$ of this algebra are usually interpreted as coordinates of this 
noncommutative space, and the $\mathcal{A}$-bimodule $\mathcal{M}$ plays the role of the space of differential 
forms of degree one. In this paper we will use the corresponding terminology in order 
to stress the relation with noncommutative geometry. 

Let us consider the structure of a coordinate first order differential calculus. This 
differential calculus induces the differentials $dx^1, dx^2, \ldots, dx^n$ of the generators 
$x^1, x^2, \ldots, x^n$. Evidently $dx^1, dx^2, \ldots, dx^n \in \mathcal{M}$. $\mathcal{M}$ is a bimodule, i.e. it has the 
structure of a left $\mathcal{A}$-module and of a right $\mathcal{A}$-module. Hence for any two elements 
$f, h \in \mathcal{A}$ and $\omega \in \mathcal{M}$ it holds that $(f\omega)h = f(\omega h)$. According to the definition of a 
coordinate first order differential calculus the right $\mathcal{A}$-module $\mathcal{M}$ is freely generated 
by the differentials of the generators $dx^1, dx^2, \ldots, dx^n$. Thus for any $\omega \in \mathcal{M}$ we have 
$\omega = dx^1 f_1 + dx^2 f_2 + \cdots + dx^n f_n$, where $f_1, f_2, \ldots, f_n \in \mathcal{A}$. A coordinate first order 
differential calculus $(\mathcal{A}, \mathcal{M}, d)$ is an algebraic structure which extends the classical 
differential structure of a manifold to the noncommutative case. From the point of view 
of noncommutative geometry $\mathcal{A}$ can be viewed as an algebra of smooth functions, 
$d$ is the exterior differential, and $\mathcal{M}$ is the bimodule of differential 1-forms. In order 
to stress this analogy we will call the elements of the algebra $\mathcal{A}$ "functions" and the 
elements of the $\mathcal{A}$-bimodule $\mathcal{M}$ "1-forms". 

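Because $\mathcal{M}$ is an $\mathcal{A}$-bimodule, for any function $f \in \mathcal{A}$ we have two products $f\, dx^i$ 
and $dx^i f$. Since $dx^1, dx^2, \ldots, dx^n$ is a basis for the right $\mathcal{A}$-module $\mathcal{M}$, each 
element of $\mathcal{M}$ can be expressed as a linear combination of $dx^1, dx^2, \ldots, dx^n$ multiplied 
by functions from the right. Hence the element $f\, dx^i \in \mathcal{M}$ can be expressed in 
this way, i.e. 

\[ f\, dx^i = dx^1 r^i_1(f) + dx^2 r^i_2(f) + \cdots + dx^n r^i_n(f) = dx^j r^i_j(f), \tag{4} \]

where $r^i_1(f), r^i_2(f), \ldots, r^i_n(f) \in \mathcal{A}$ are functions. Making use of these functions 
we can compose the square matrix 

\[ R(f) = (r^i_j(f)) = \begin{pmatrix} r^1_1(f) & r^2_1(f) & \cdots & r^n_1(f) \\ \vdots & \vdots & \ddots & \vdots \\ r^1_n(f) & r^2_n(f) & \cdots & r^n_n(f) \end{pmatrix}. \]

It is worth pointing out that the entry $r^i_j(f)$ stands at the intersection of the $i$-th column 
and the $j$-th row. This square matrix determines the mapping $R : \mathcal{A} \to \mathrm{Mat}_n(\mathcal{A})$, where 
$\mathrm{Mat}_n(\mathcal{A})$ is the algebra of square matrices of order $n$ over the algebra $\mathcal{A}$. The following 
statement can be proved. 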


Proposition 3 $R : \mathcal{A} \to \mathrm{Mat}_n(\mathcal{A})$ is a homomorphism of algebras. 


Proof We need to prove that for any $f, g \in \mathcal{A}$ it holds that $R(fg) = R(f) R(g)$. 
According to Eq. (4) we have 

\[ (fg)\, dx^i = dx^j r^i_j(fg). \]

The left-hand side of the above relation can be written as 

\[ f (g\, dx^i) = f (dx^j r^i_j(g)) = (f\, dx^j)\, r^i_j(g) = (dx^k r^j_k(f))\, r^i_j(g) = dx^k \left( r^j_k(f)\, r^i_j(g) \right). \]

Now we can write 

\[ dx^k r^i_k(fg) = dx^k \left( r^j_k(f)\, r^i_j(g) \right) \;\Rightarrow\; r^i_k(fg) = r^j_k(f)\, r^i_j(g), \]

or in matrix form $R(fg) = R(f) R(g)$, which ends the proof. 


Let $(\mathcal{A}, \mathcal{M}, d)$ be a coordinate first order differential calculus such that the right 
$\mathcal{A}$-module $\mathcal{M}$ is finite and freely generated by the differentials of coordinates $\{dx^i\}_{i=1}^n$. 
The mappings $\partial_k : \mathcal{A} \to \mathcal{A}$, where $k \in \{1, 2, \ldots, n\}$, uniquely defined by 

\[ df = dx^k\, \partial_k(f), \quad f \in \mathcal{A}, \tag{5} \]

are called the right partial derivatives of the coordinate first order differential calculus. 
The following statement can be proved. 


Proposition 4 If $(\mathcal{A}, \mathcal{M}, d)$ is a coordinate first order differential calculus over an 
algebra $\mathcal{A}$ such that $\mathcal{M}$ is a finite freely generated right $\mathcal{A}$-module with a basis 
$\{dx^i\}_{i=1}^n$ then the right partial derivatives $\partial_k : \mathcal{A} \to \mathcal{A}$ of this differential calculus 
satisfy 

\[ \partial_k(fg) = \partial_k(f)\, g + r^i_k(f)\, \partial_i(g). \tag{6} \]

The property (6) is called the twisted (with homomorphism $R$) Leibniz rule for 
partial derivatives. 

If $\mathcal{A}$ is a graded q-differential algebra with differential $d$ then evidently the 
subspace of elements of degree zero $\mathcal{A}^0$ is a subalgebra of $\mathcal{A}$, the subspace of elements 
of degree one $\mathcal{A}^1$ is an $\mathcal{A}^0$-bimodule, and the differential $d : \mathcal{A}^0 \to \mathcal{A}^1$ satisfies the 
Leibniz rule. Consequently we have the first order differential calculus $(\mathcal{A}^0, \mathcal{A}^1, d)$ of the 
graded q-differential algebra $\mathcal{A}$. If $\mathcal{A}^0$ is generated by some set of variables then we 
can construct a coordinate first order differential calculus with the corresponding right 
partial derivatives. 


4 First Order Differential Calculus of Semi-commutative 
Galois Extension 


It is shown in Sect. 2 that given a semi-commutative Galois extension we can construct 
a graded q-differential algebra. In the previous section we described the structure of 
a coordinate first order differential calculus over an associative unital algebra, and 
at the end of that section we also mentioned that the subspaces $\mathcal{A}^0$, $\mathcal{A}^1$ of a graded 
q-differential algebra together with the differential $d$ of this algebra can be viewed as a 
first order differential calculus over $\mathcal{A}^0$. In this section we apply the approach of first 
order differential calculus to the graded q-differential algebra of a semi-commutative 
Galois extension. 

Let $\mathcal{A}[\tau]$ be a semi-commutative Galois extension of an algebra $\mathcal{A}$ by means 
of $\tau$. Thus we have an algebra $\mathcal{A}$ and the $\mathcal{A}$-bimodule $\mathcal{A}^1[\tau]$. Next we have the 
N-differential $d : \mathcal{A}[\tau] \to \mathcal{A}[\tau]$ induced by $\tau$, and if we restrict this N-differential 
to the subalgebra $\mathcal{A}$ of the Galois extension $\mathcal{A}[\tau]$ then $d : \mathcal{A} \to \mathcal{A}^1[\tau]$ satisfies the 
Leibniz rule. Consequently we have the first order differential calculus which can 
be written as the triple $(\mathcal{A}, d, \mathcal{A}^1[\tau])$. In order to describe the structure of this first 
order differential calculus we will need the vector space endomorphism $\Delta : \mathcal{A} \to \mathcal{A}$ 
defined by 

\[ \Delta u = u - u_\tau, \quad u \in \mathcal{A}. \]

For any elements $u, v \in \mathcal{A}$ this endomorphism satisfies 

\[ \Delta(uv) = \Delta(u)\, v + u_\tau\, \Delta(v). \]
Let us assume that there exists an element $x \in \mathcal{A}$ such that the element $\Delta x \in \mathcal{A}$ 
is invertible, and the inverse element will be denoted by $\Delta x^{-1}$. The differential $dx$ 
of the element $x$ can be written in the form $dx = \tau\, \Delta x$, which clearly shows that $dx$ 
has degree one, i.e. $dx \in \mathcal{A}^1[\tau]$, and hence $dx$ can be used as a generator for the right 
$\mathcal{A}$-module $\mathcal{A}^1[\tau]$. Let us denote by $\phi_{dx} : u \mapsto \phi_{dx}(u) = u_{dx}$ the endomorphism of 
$\mathcal{A}$ induced by the bimodule structure of $\mathcal{A}^1[\tau]$ in the basis $dx$. Then 

\[ u_{dx} = \Delta x^{-1}\, u_\tau\, \Delta x = \mathrm{Ad}_{\Delta x}\, u_\tau. \tag{7} \]


Definition 3 For any element $u \in \mathcal{A}$ we define the right derivative $\frac{du}{dx} \in \mathcal{A}$ (with 
respect to $x$) by the formula 

\[ du = dx\, \frac{du}{dx}. \tag{8} \]

Analogously one can define the left derivative with respect to $x$ by means of the 
left $\mathcal{A}$-module structure of $\mathcal{A}^1[\tau]$. Further we will only use the right derivative, which 
will be referred to as the derivative and will often be denoted by $u'_x$. Thus we have 
the linear mapping 

\[ \frac{d}{dx} : \mathcal{A} \to \mathcal{A}, \quad \frac{d}{dx} : u \mapsto u'_x. \]


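Proposition 5 For any element $u \in \mathcal{A}$ we have 

\[ \frac{du}{dx} = \Delta x^{-1}\, \Delta u. \tag{9} \]

The derivative (8) satisfies the twisted Leibniz rule, i.e. for any two elements 
$u, v \in \mathcal{A}$ it holds that 

\[ \frac{d}{dx}(uv) = \frac{du}{dx}\, v + \phi_{dx}(u)\, \frac{dv}{dx} = \frac{du}{dx}\, v + \mathrm{Ad}_{\Delta x} u_\tau\, \frac{dv}{dx}. \]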
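
The formula for the derivative and the twisted Leibniz rule can be tried out in the simplest commutative model. The sketch below assumes $\mathcal{A} = \mathbb{C}$ with the endomorphism $u \mapsto u_\tau$ given by complex conjugation (so $N = 2$); this choice and the function names are assumptions made for illustration. Since $\mathcal{A}$ is commutative here, $\mathrm{Ad}_{\Delta x} u_\tau = u_\tau$.

```python
def delta(u):
    """Delta u = u - u_tau, with u_tau taken to be the complex conjugate of u."""
    return u - u.conjugate()


def derivative(u, x):
    """du/dx = (Delta x)^{-1} Delta u; requires Delta x != 0, i.e. Im(x) != 0."""
    return delta(u) / delta(x)


def twisted_leibniz_gap(u, v, x):
    """|d(uv)/dx - (du/dx v + u_tau dv/dx)|, which should vanish up to rounding."""
    lhs = derivative(u * v, x)
    rhs = derivative(u, x) * v + u.conjugate() * derivative(v, x)
    return abs(lhs - rhs)
```

With `x = 1j` the derivative of `u = a + b*1j` is simply `b`, and applying the derivative twice gives zero, as expected for $N = 2$.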
We have constructed the first order differential calculus with one variable $x$, and it 
is natural to study the transformation rule of the derivative of this calculus if we choose 
another variable. From the point of view of differential geometry we will study a 
change of coordinate in a one-dimensional space. Let $y \in \mathcal{A}$ be an element such 
that $\Delta y = y - y_\tau$ is invertible. 


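Proposition 6 Let $x, y$ be elements of $\mathcal{A}$ such that $\Delta x$, $\Delta y$ are invertible elements 
of $\mathcal{A}$. Then 

\[ dy = dx\, y'_x, \quad dx = dy\, x'_y, \quad \frac{d}{dx} = y'_x\, \frac{d}{dy}, \quad \frac{d}{dy} = x'_y\, \frac{d}{dx}, \]

where $x'_y = (y'_x)^{-1}$. 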
Indeed we have $dy = \tau\, \Delta y$, $dx = \tau\, \Delta x$. Hence $\tau = dx\, \Delta x^{-1}$ and 

\[ dy = dx\, (\Delta x^{-1} \Delta y) = dx\, y'_x. \]

If $u$ is any element of $\mathcal{A}$ then for the derivatives we have 

\[ \frac{du}{dx} = \Delta x^{-1} \Delta u = (\Delta x^{-1} \Delta y)(\Delta y^{-1} \Delta u) = y'_x\, \frac{du}{dy}. \]
dx dy 


As an example of the structure of graded q-differential algebra induced by $d_\tau$ on a 
semi-commutative Galois extension we can consider the quaternion algebra $\mathbb{H}$. The 
quaternion algebra $\mathbb{H}$ is the associative unital algebra generated over $\mathbb{R}$ by $i, j, k$, which 
are subject to the relations 

\[ i^2 = j^2 = k^2 = -1, \quad ij = -ji = k, \quad jk = -kj = i, \quad ki = -ik = j, \]

where 1 is the unit element of $\mathbb{H}$. Given a quaternion 

\[ q = a_0 1 + a_1 i + a_2 j + a_3 k \]

we can write it in the form $q = (a_0 1 + a_2 j) + i (a_1 + a_3 j)$. Hence if we consider the 
coefficients of the previous expression $z_0 = a_0 1 + a_2 j$, $z_1 = a_1 + a_3 j$ as complex 
numbers then $q = z_0 1 + i z_1$, which clearly shows that the quaternion algebra $\mathbb{H}$ can 
be viewed as the semi-commutative Galois extension $\mathbb{C}[i]$. Evidently in this case we 
have $N = 2$, $q = -1$, and the $\mathbb{Z}_2$-graded structure defined by $|1| = 0$, $|i| = 1$. Hence we 
can use the terminology of superalgebras. It is easy to see that the subspace of odd 
elements (degree 1) can be considered as a bimodule over the subalgebra of even 
elements $a 1 + b j$, and this bimodule induces the endomorphism $\phi : \mathbb{C} \to \mathbb{C}$, where 
$\phi(z) = \bar{z}$. Let $d$ be the differential of degree one (odd degree operator) induced by $i$. 
Then making use of (3) for any quaternion $q$ we have 

\[ dq = d(z_0 1 + i z_1) = i\, (z_0 - \bar{z}_0) - (z_1 + \bar{z}_1)\, 1. \]

Obviously $d^2 q = 0$. 
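
The quaternion example can be reproduced numerically. In the sketch below a quaternion $a_0 1 + a_1 i + a_2 j + a_3 k$ is stored as a 4-tuple, the even part is spanned by $1, j$ and the odd part by $i, k$, and the differential is $d(u) = iu - (-1)^{|u|} ui$ on homogeneous elements. The representation and the function names are assumptions chosen for illustration.

```python
def qmul(a, b):
    """Hamilton product of quaternions stored as (a0, a1, a2, a3)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
            a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
            a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
            a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0)


def add(a, b):
    return tuple(s + t for s, t in zip(a, b))


def sub(a, b):
    return tuple(s - t for s, t in zip(a, b))


I = (0, 1, 0, 0)  # the element tau = i


def d(u):
    """d(u) = i u - (-1)^{|u|} u i, applied to the even and odd parts separately."""
    even = (u[0], 0, u[2], 0)   # span{1, j}, degree 0
    odd = (0, u[1], 0, u[3])    # span{i, k}, degree 1
    d_even = sub(qmul(I, even), qmul(even, I))
    d_odd = add(qmul(I, odd), qmul(odd, I))
    return add(d_even, d_odd)
```

For $q = 1 + 2i + 3j + 4k$ one has $z_0 = 1 + 3j$ and $z_1 = 2 + 4j$, so $dq = i(z_0 - \bar{z}_0) - (z_1 + \bar{z}_1) = 6k - 4$, which is the value `d((1, 2, 3, 4))` returns.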


5 Higher Order Differential Calculus 
of Semi-commutative Galois Extension 


Our aim in this section is to develop a higher order differential calculus of a 
semi-commutative Galois extension $\mathcal{A}[\tau]$. This higher order differential calculus 
is induced by the graded q-differential algebra structure. In Sect. 2 it is mentioned 
that a graded q-differential algebra can be viewed as a generalization of the concept of 
graded differential algebra, which is recovered for $N = 2$, $q = -1$. It is well known that one of the 
most important realizations of a graded differential algebra is the algebra of differential 
forms on a smooth manifold. Hence we can consider the elements of the graded 
q-differential algebra constructed by means of a semi-commutative Galois extension 
$\mathcal{A}[\tau]$ and expressed in terms of the differential $dx$ as noncommutative analogs of 
differential forms with an exterior differential $d$ which satisfies $d^N = 0$. In order to stress this 
analogy we will consider an element $x \in \mathcal{A}$ as an analog of a coordinate, the elements of 
degree zero as analogs of functions, elements of degree $k$ as analogs of $k$-forms, and 
we will use the corresponding terminology. It should be pointed out that because of 
the equation $d^N = 0$ there are higher order differentials $dx, d^2x, \ldots, d^{N-1}x$ in this 
algebra of differential forms. 

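Before we describe the structure of higher order differential forms it is useful to 
introduce the polynomials $P_k(x)$, $Q_k(x)$, where $k = 1, 2, \ldots, N$. Let us recall that 
$\Delta x = x - x_\tau \in \mathcal{A}$. Applying the endomorphism induced by $\tau$ we can generate the sequence of 
elements 

\[ \Delta x_\tau = x_\tau - x_{\tau^2}, \quad \Delta x_{\tau^2} = x_{\tau^2} - x_{\tau^3}, \quad \ldots, \quad \Delta x_{\tau^{N-1}} = x_{\tau^{N-1}} - x. \]

Obviously each element of this sequence is invertible. Now we define the sequence 
of polynomials $Q_1(x), Q_2(x), \ldots, Q_N(x)$, where 

\[ Q_k(x) = \Delta x_{\tau^{k-1}}\, \Delta x_{\tau^{k-2}} \cdots \Delta x_\tau\, \Delta x. \]

These polynomials can be defined by means of the recurrence relation 

\[ Q_{k+1}(x) = (Q_k(x))_\tau\, \Delta x. \]

It should be mentioned that $Q_k(x)$ is an invertible element and 

\[ (Q_k(x))^{-1} = \Delta x^{-1}\, \Delta x_\tau^{-1} \cdots \Delta x_{\tau^{k-1}}^{-1}. \]

We define the sequence of elements $P_1(x), P_2(x), \ldots, P_N(x) \in \mathcal{A}$ by the recurrence 
formula 

\[ P_{k+1}(x) = P_k(x) - q^k (P_k(x))_\tau, \quad k = 1, 2, \ldots, N-1, \]

and $P_1(x) = \Delta x$. Clearly $P_1(x) = Q_1(x)$, and for $k = 2, 3$ a straightforward 
calculation gives 

\[
\begin{aligned}
P_2(x) &= x - (1 + q)\, x_\tau + q\, x_{\tau^2}, \\
P_3(x) &= x - (1 + q + q^2)\, x_\tau + (q + q^2 + q^3)\, x_{\tau^2} - q^3 x_{\tau^3}.
\end{aligned}
\]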


Proposition 7 If $q$ is a primitive $N$th root of unity then there are the identities

$$P_{N-1}(x) + (P_{N-1}(x))_\tau + \cdots + (P_{N-1}(x))_{\tau^{N-1}} = 0, \quad P_N(x) = 0.$$
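The recurrence for $P_k(x)$ and the two identities of Proposition 7 can be checked numerically. The following is a minimal sketch, assuming that $x, x_\tau, \ldots, x_{\tau^{N-1}}$ are treated as formal, linearly independent symbols on which $\tau$ acts by a cyclic shift (so that $x_{\tau^N} = x$); the value $N = 5$ and the names `tau`, `P` and `orbit_sum` are illustrative choices and do not come from the text.

```python
import numpy as np

N = 5                                   # order of tau, i.e. tau^N = 1
q = np.exp(2j * np.pi / N)              # a primitive N-th root of unity

def tau(c):
    """Move the coefficient of x_{tau^j} to x_{tau^(j+1)} (cyclically)."""
    return np.roll(c, 1)

# P[k] stores the coefficients of P_k(x) in the basis x, x_tau, ..., x_{tau^(N-1)}
P = {1: np.zeros(N, dtype=complex)}
P[1][0], P[1][1] = 1, -1                # P_1(x) = x - x_tau
for k in range(1, N):
    P[k + 1] = P[k] - q**k * tau(P[k])  # P_{k+1} = P_k - q^k (P_k)_tau

# P_{N-1} + (P_{N-1})_tau + ... + (P_{N-1})_{tau^(N-1)}
orbit_sum = sum(np.roll(P[N - 1], j) for j in range(N))
```

In this model both `P[N]` and `orbit_sum` vanish up to rounding, and `P[2]`, `P[3]` reproduce the coefficients displayed above.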




Now we will describe the structure of higher order differential forms. It follows
from the previous section that any 1-form $\omega$, i.e. an element of $\mathcal{A}^1[\tau]$, can be written
in the form $\omega = dx\,u$, where $u \in \mathcal{A}$. Evidently $d : \mathcal{A} \to \mathcal{A}^1[\tau]$, $du = dx\,u'_x$. The
elements of $\mathcal{A}^2[\tau]$ will be referred to as 2-forms. In this case there are two choices
for a basis for the right $\mathcal{A}$-module $\mathcal{A}^2[\tau]$. We can take either $\tau^2$ or $(dx)^2$ as a basis
for $\mathcal{A}^2[\tau]$. Indeed we have

$$(dx)^2 = \tau^2\, Q_2(x).$$

It is worth mentioning that the second order differential $d^2x$ can be used as the
basis for $\mathcal{A}^2[\tau]$ only in the case when $P_2(x)$ is invertible. Indeed we have

$$d^2x = \tau^2\, P_2(x), \quad d^2x = (dx)^2\, Q_2^{-1}(x)\, P_2(x).$$


If we choose $(dx)^2$ as the basis for the module of 2-forms $\mathcal{A}^2[\tau]$ then any 2-form
$\omega$ can be written as $\omega = (dx)^2 u$, where $u \in \mathcal{A}$. Now the differential of any 1-form
$\omega = dx\,u$, where $u \in \mathcal{A}$, can be expressed as follows

$$d\omega = (dx)^2 \left(q\, u'_x + Q_2^{-1}(x)\, P_2(x)\, u\right). \tag{10}$$

It should be pointed out that the second factor of the right-hand side of the above
formula resembles a covariant derivative in classical differential geometry. Hence
we can introduce the linear operator $D : \mathcal{A} \to \mathcal{A}$ by the formula

$$D u = q\, u'_x + Q_2^{-1}(x)\, P_2(x)\, u, \quad u \in \mathcal{A}. \tag{11}$$

If $\omega = dv$, $v \in \mathcal{A}$, i.e. $\omega$ is an exact form, then

$$d\omega = d^2 v = (dx)^2\, D v'_x = (dx)^2 \left(q\, v''_x + Q_2^{-1}(x)\, P_2(x)\, v'_x\right).$$

If we consider the simplest case $N = 2$, $q = -1$ then

$$d^2 v = 0, \quad P_2(x) = 0, \quad (dx)^2 \neq 0,$$

and from the above formula it follows that $v''_x = 0$.


Proposition 8 Let $\mathcal{A}[\tau]$ be a semi-commutative Galois extension of algebra $\mathcal{A}$ by
means of $\tau$, which satisfies $\tau^2 = 1$, and $d$ be the differential of the graded differential
algebra induced by an element $\tau$ as it is shown in Proposition 2. Let $x \in \mathcal{A}$ be an
element such that $\Delta x$ is invertible. Then for any element $u \in \mathcal{A}$ it holds $u''_x = 0$,
where $u'_x$ is the derivative (8) induced by $d$. Hence any element of an algebra $\mathcal{A}$ is
linear with respect to $x$.


The quaternions considered as the noncommutative Galois extension of complex
numbers (Sect. 3) provide a simple example for the above proposition. Indeed in
this case $\tau = i$, $\mathcal{A} = \mathbb{C}$, where the imaginary unit is identified with $j$, and
$(a\,1 + b\,j)_\tau = a\,1 - b\,j$. Hence we can choose $x = a\,1 + b\,j$ iff $b \neq 0$. Indeed in this case
$\Delta x = x - x_\tau = a\,1 + b\,j - a\,1 + b\,j = 2b\,j$, and $\Delta x$ is invertible iff $b \neq 0$. Now any
$z = c\,1 + d\,j \in \mathcal{A}$ can be uniquely written in the form $z = \tilde{c}\,1 + \tilde{d}\,x$ iff

$$\begin{vmatrix} 1 & a \\ 0 & b \end{vmatrix} = b \neq 0.$$

Thus any $z \in \mathcal{A}$ is linear with respect to $x$.

Now we will describe the structure of the module of $k$-forms $\mathcal{A}^k[\tau]$. We choose
$(dx)^k$ as the basis for the right $\mathcal{A}$-module $\mathcal{A}^k[\tau]$, then any $k$-form $\omega$ can be written
$\omega = (dx)^k u$, $u \in \mathcal{A}$. We have the following relations

$$(dx)^k = \tau^k\, Q_k(x), \quad d^k x = \tau^k\, P_k(x).$$

In order to get a formula for the exterior differential of a $k$-form $\omega$ we need
the polynomials $\Phi_1(x), \Phi_2(x), \ldots, \Phi_{N-1}(x)$ which can be defined by the recurrent
relation

$$\Phi_{k+1}(x) = \mathrm{Ad}_{\Delta x}(\Phi_k) + q^k\, \Phi_1(x), \quad k = 1, 2, \ldots, N-1, \tag{12}$$

where $\Phi_1(x) = Q_2^{-1}(x)\, P_2(x)$. These polynomials satisfy the relations
$d(dx)^k = (dx)^{k+1}\, \Phi_k(x)$ and given a $k$-form $\omega = (dx)^k u$, $u \in \mathcal{A}$ we find its exterior
differential as

$$d\omega = (dx)^{k+1} \left(q^k\, u'_x + \Phi_k(x)\, u\right) = (dx)^{k+1}\, D^{(k)} u.$$

The linear operator $D^{(k)} : \mathcal{A} \to \mathcal{A}$, $k = 1, 2, \ldots, N-1$ introduced in the previous
formula has the form

$$D^{(k)} u = q^k\, u'_x + \Phi_k(x)\, u, \tag{13}$$

and, as it was mentioned before, this operator resembles a covariant derivative of
classical differential geometry. It is easy to see that the operator (11) is the particular
case of (13), i.e. $D^{(1)} = D$.


6 Semi-commutative Galois Extension Approach 
to Reduced Quantum Plane 


In this section we show that a reduced quantum plane can be considered as a semi- 
commutative Galois extension. We study a first order and higher order differential 
calculus of a semi-commutative Galois extension in the particular case of a reduced 
quantum plane. 
Let $x, y$ be two variables which obey the commutation relation

$$x\,y = q\, y\,x, \tag{14}$$

where $q \neq 0, 1$ is a complex number. These two variables generate an associative
algebra of polynomials over $\mathbb{C}$, and the identity element of this algebra will be
denoted by 1.
In noncommutative geometry and theoretical physics a polynomial of this algebra
is interpreted as a function on a quantum plane with two noncommuting coordinate
functions $x, y$ and the algebra of polynomials is interpreted as the algebra of
(polynomial) functions on a quantum plane. If we fix an integer $N \geq 2$ and impose the
additional condition

$$x^N = y^N = 1, \tag{15}$$

then a quantum plane is referred to as a reduced quantum plane and this polynomial
algebra will be denoted by $\mathcal{A}_q[x, y]$.

Let us mention that from an algebraic point of view an algebra of functions
on a reduced quantum plane may be identified with the generalized Clifford algebra
with two generators $x, y$. Indeed a generalized Clifford algebra is an associative
unital algebra generated by variables $x_1, x_2, \ldots, x_p$ obeying the relations
$x_i x_j = q^{\mathrm{sg}(j-i)}\, x_j x_i$, $x_i^N = 1$, where $\mathrm{sg}$ is the sign function.

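It is well known that the generalized Clifford algebras have matrix representations,
and, in the particular case of the algebra $\mathcal{A}_q[x, y]$, the generators of this algebra $x, y$
can be identified with the square matrices of order $N$

$$x = \begin{pmatrix}
1 & 0 & 0 & \cdots & 0 & 0 \\
0 & q^{-1} & 0 & \cdots & 0 & 0 \\
0 & 0 & q^{-2} & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & q^{-(N-2)} & 0 \\
0 & 0 & 0 & \cdots & 0 & q^{-(N-1)}
\end{pmatrix}, \quad
y = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0 & 0 \\
0 & 0 & 1 & \cdots & 0 & 0 \\
0 & 0 & 0 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 0 & 1 \\
1 & 0 & 0 & \cdots & 0 & 0
\end{pmatrix} \tag{16}$$

where $q$ is a primitive $N$th root of unity. As the matrices (16) generate the algebra
$\mathrm{Mat}_N(\mathbb{C})$ of square matrices of order $N$ we can identify the algebra of functions on
a reduced quantum plane with the algebra of matrices $\mathrm{Mat}_N(\mathbb{C})$.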
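The matrix realization can be tested directly. The following is a minimal sketch, assuming the diagonal matrix $x = \mathrm{diag}(1, q^{-1}, \ldots, q^{-(N-1)})$ and the cyclic shift matrix $y$ (ones above the diagonal and in the lower left corner); the choice $N = 4$ and all variable names are illustrative. It checks relations (14) and (15) and that the monomials $y^k x^l$ span the whole matrix algebra.

```python
import numpy as np

N = 4
q = np.exp(2j * np.pi / N)                        # primitive N-th root of unity

x = np.diag(q ** (-np.arange(N)))                 # diag(1, q^-1, ..., q^-(N-1))
y = np.roll(np.eye(N, dtype=complex), 1, axis=1)  # cyclic shift matrix

power = np.linalg.matrix_power
monomials = [power(y, k) @ power(x, l) for k in range(N) for l in range(N)]

# Rank of the N^2 monomials viewed as vectors in C^(N*N)
rank = np.linalg.matrix_rank(np.array([m.ravel() for m in monomials]))
```

A rank of $N^2$ means that the $N^2$ monomials are linearly independent, which is consistent with the identification of the algebra with $\mathrm{Mat}_N(\mathbb{C})$.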

The set of monomials $B = \{1, y, x, x^2, yx, y^2, \ldots, y^k x^l, \ldots, y^{N-1}x^{N-1}\}$ can be
taken as the basis for the vector space of the algebra $\mathcal{A}_q[x, y]$. We can endow this
vector space with a $\mathbb{Z}_N$-graded structure if we assign degree zero to the identity
element 1 and variable $x$ and we assign degree one to the variable $y$. As usual we
define the degree of a product of two variables $x, y$ as the sum of degrees of factors.
Then a polynomial

$$w = \sum_{l=0}^{N-1} \beta_l\, y^k x^l, \quad \beta_l \in \mathbb{C}, \tag{17}$$

will be a homogeneous polynomial with degree $k$. Let us denote the degree of a
homogeneous polynomial $w$ by $|w|$ and the subspace of the homogeneous polynomials of
degree $k$ by $\mathcal{A}_q^k[x, y]$. It is obvious that

$$\mathcal{A}_q[x, y] = \mathcal{A}_q^0[x, y] \oplus \mathcal{A}_q^1[x, y] \oplus \cdots \oplus \mathcal{A}_q^{N-1}[x, y]. \tag{18}$$

In particular a polynomial $r$ of degree zero can be written as follows

$$r = \sum_{l=0}^{N-1} \beta_l\, x^l, \quad \beta_l \in \mathbb{C}, \quad r \in \mathcal{A}_q^0[x, y]. \tag{19}$$

Obviously the subspace of elements of degree zero $\mathcal{A}_q^0[x, y]$ is the subalgebra of
$\mathcal{A}_q[x, y]$ generated by the variable $x$. Evidently the polynomial algebra $\mathcal{A}_q[x, y]$ of
polynomials of a reduced quantum plane can be considered as a semi-commutative
Galois extension of the subalgebra $\mathcal{A}_q^0[x, y]$ by means of the element $y$ which
satisfies the relation $y^N = 1$. The commutation relation $x\,y = q\, y\,x$ gives us the
semi-commutativity of this extension.

Now we can endow the polynomial algebra $\mathcal{A}_q[x, y]$ with an $N$-differential $d$.
Making use of Theorem 1 we define the $N$-differential by the following formula

$$dw = [y, w]_q = y\,w - q^{|w|}\, w\,y, \tag{20}$$

where $q$ is a primitive $N$th root of unity and $w \in \mathcal{A}_q[x, y]$. Hence the algebra $\mathcal{A}_q[x, y]$
equipped with the $N$-differential $d$ is a graded q-differential algebra.
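The property $d^N = 0$ of the graded q-commutator (20) can be observed numerically. The following is a minimal sketch, assuming the matrix realization $x = \mathrm{diag}(1, q^{-1}, \ldots, q^{-(N-1)})$, $y$ the cyclic shift matrix, and degrees counted modulo $N$; the random test element and the names `d`, `form`, `degree` are illustrative.

```python
import numpy as np

N = 5
q = np.exp(2j * np.pi / N)
x = np.diag(q ** (-np.arange(N)))
y = np.roll(np.eye(N, dtype=complex), 1, axis=1)

def d(w, degree):
    """Graded q-commutator dw = y w - q^|w| w y; returns (dw, degree of dw mod N)."""
    return y @ w - q**degree * (w @ y), (degree + 1) % N

rng = np.random.default_rng(0)
k = 2
# x has N distinct eigenvalues, so every diagonal matrix is a polynomial r(x)
r = np.diag(rng.normal(size=N) + 1j * rng.normal(size=N))
w = np.linalg.matrix_power(y, k) @ r     # homogeneous element y^k r of degree k

form, degree = w, k
for _ in range(N):                       # apply d exactly N times
    form, degree = d(form, degree)
```

After $N$ applications `form` is the zero matrix up to rounding, and the degree has returned to its initial value modulo $N$.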

In order to give a differential-geometric interpretation to the graded q-differential
algebra structure of $\mathcal{A}_q[x, y]$ induced by the $N$-differential $d$, we interpret the
commutative subalgebra $\mathcal{A}_q^0[x, y]$ of the $x$-polynomials (19) of $\mathcal{A}_q[x, y]$ as an algebra of
polynomial functions on a one dimensional space with coordinate $x$. Since $\mathcal{A}_q^k[x, y]$
for $k > 0$ is an $\mathcal{A}_q^0[x, y]$-bimodule we interpret this $\mathcal{A}_q^0[x, y]$-bimodule of the
elements of degree $k$ as a bimodule of differential forms of degree $k$ and we shall call
an element of this bimodule a differential $k$-form on a one dimensional space with
coordinate $x$. The $N$-differential $d$ can be interpreted as an exterior differential.

It is easy to show that in the one dimensional case we have a simple situation when
every bimodule $\mathcal{A}_q^k[x, y]$, $k > 0$ of the differential $k$-forms is a free right module
over the commutative algebra of functions $\mathcal{A}_q^0[x, y]$. Indeed if we write a differential
$k$-form $w$ as follows

$$w = y^k \sum_{l=0}^{N-1} \beta_l\, x^l = y^k r, \quad r = \sum_{l=0}^{N-1} \beta_l\, x^l \in \mathcal{A}_q^0[x, y], \tag{21}$$

and take into account that the polynomial $r = (y^k)^{-1} w = y^{N-k} w$ is uniquely
determined then we can conclude that $\mathcal{A}_q^k[x, y]$ is a free right module over $\mathcal{A}_q^0[x, y]$
generated by $y^k$.
As it was mentioned before, a bimodule structure of a free right module over an
algebra $\mathcal{B}$ generated freely by $p$ generators is uniquely determined by the
homomorphism from the algebra $\mathcal{B}$ to the algebra of $(p \times p)$-matrices over $\mathcal{B}$. In the case of a
reduced quantum plane every right module $\mathcal{A}_q^k[x, y]$ is freely generated by one
generator (for instance we can take $y^k$ as a generator of this module). Thus its bimodule
structure induces an endomorphism of the algebra of functions $\mathcal{A}_q^0[x, y]$ and denoting
this endomorphism in the case of the generator $y^k$ by $A_k : \mathcal{A}_q^0[x, y] \to \mathcal{A}_q^0[x, y]$
we get

$$r\, y^k = y^k A_k(r) \quad \text{(no summation over } k\text{)} \tag{22}$$

for any function $r \in \mathcal{A}_q^0[x, y]$. Making use of the commutation relations of
variables $x, y$ we easily find that $A_k(x) = q^k x$. Since the algebra of functions $\mathcal{A}_q^0[x, y]$
may be viewed as a bimodule over the same algebra we can consider the
functions as degree zero differential forms, and the corresponding endomorphism is
the identity mapping of $\mathcal{A}_q^0[x, y]$, i.e. $A_0 = I$, where $I : \mathcal{A}_q^0[x, y] \to \mathcal{A}_q^0[x, y]$ is
the identity mapping. Thus the bimodule structures of the free right modules
$\mathcal{A}_q^0[x, y], \mathcal{A}_q^1[x, y], \ldots, \mathcal{A}_q^{N-1}[x, y]$ of differential forms induce the associated
endomorphisms $A_0, A_1, \ldots, A_{N-1}$ of the algebra $\mathcal{A}_q^0[x, y]$. It is easy to see that for any $k$
it holds $A_k = A_1^k$.

Let us start with the first order differential calculus $(\mathcal{A}_q^0[x, y], \mathcal{A}_q^1[x, y], d)$
over the algebra of functions $\mathcal{A}_q^0[x, y]$ induced by the $N$-differential $d$, where
$d : \mathcal{A}_q^0[x, y] \to \mathcal{A}_q^1[x, y]$ and $\mathcal{A}_q^1[x, y]$ is the bimodule over $\mathcal{A}_q^0[x, y]$. For any
$w \in \mathcal{A}_q^0[x, y]$ we have

$$dw = y\,w - w\,y = y\,w - y\,A_1(w) = y\,(w - A_1(w)) = y\,\Delta_q(w), \tag{23}$$

where $\Delta_q = I - A_1 : \mathcal{A}_q^0[x, y] \to \mathcal{A}_q^0[x, y]$. It is easy to verify that for any two
functions $w, w' \in \mathcal{A}_q^0[x, y]$ the mapping $\Delta_q$ has the following properties

$$\Delta_q(w w') = \Delta_q(w)\, w' + A_1(w)\, \Delta_q(w'), \tag{24}$$
$$\Delta_q(x^k) = (1 - q)\,[k]_q\, x^k. \tag{25}$$

In particular $dx = y\,\Delta_q(x)$, and this formula shows that $dx$ can be taken as a
generator for the free right module $\mathcal{A}_q^1[x, y]$.

Since the bimodule $\mathcal{A}_q^1[x, y]$ of the first order differential calculus
$(\mathcal{A}_q^0[x, y], \mathcal{A}_q^1[x, y], d)$ is a free right module we have a coordinate first order differential
calculus over the algebra $\mathcal{A}_q^0[x, y]$, and in the case of a calculus of this kind the
differential induces the derivative $\partial : \mathcal{A}_q^0[x, y] \to \mathcal{A}_q^0[x, y]$ which is defined by the formula
$dw = dx\, \partial w$, $\forall w \in \mathcal{A}_q^0[x, y]$. Using this definition we find that for any function $w$
it holds

$$\partial w = (1 - q)^{-1}\, x^{-1}\, \Delta_q(w). \tag{26}$$

From this formula and (24), (25) it follows that this derivative satisfies the twisted
Leibniz rule

$$\partial(w w') = \partial(w) \cdot w' + A_1(w) \cdot \partial(w'), \tag{27}$$

and

$$\partial x^k = [k]_q\, x^{k-1}. \tag{28}$$
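Formulas (26)–(28) lend themselves to a direct numerical check. The following is a minimal sketch, assuming the matrix realization $x = \mathrm{diag}(1, q^{-1}, \ldots, q^{-(N-1)})$ with $y$ the cyclic shift matrix, so that functions of $x$ are diagonal matrices and $A_1(r) = y^{-1} r\, y$; the helper names `A1`, `partial` and `q_number` are hypothetical.

```python
import numpy as np

N = 5
q = np.exp(2j * np.pi / N)
x = np.diag(q ** (-np.arange(N)))
y = np.roll(np.eye(N, dtype=complex), 1, axis=1)
x_inv, y_inv = np.linalg.inv(x), np.linalg.inv(y)

def A1(r):
    """Endomorphism A_1 defined by r y = y A_1(r)."""
    return y_inv @ r @ y

def partial(w):
    """Derivative (26): (1 - q)^-1 x^-1 (w - A_1(w))."""
    return x_inv @ (w - A1(w)) / (1 - q)

def q_number(k):
    """[k]_q = 1 + q + ... + q^(k-1)."""
    return sum(q**j for j in range(k))

rng = np.random.default_rng(1)
w1 = np.diag(rng.normal(size=N) + 1j * rng.normal(size=N))  # arbitrary function of x
w2 = np.diag(rng.normal(size=N) + 1j * rng.normal(size=N))
leibniz_gap = partial(w1 @ w2) - (partial(w1) @ w2 + A1(w1) @ partial(w2))
```

Here `leibniz_gap` vanishes up to rounding, which illustrates the twisted Leibniz rule (27), and `partial` applied to powers of `x` reproduces (28).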


Let us study the structure of the higher order exterior calculus on a reduced quantum
plane or, in other words, the structure of the bimodule $\mathcal{A}_q^k[x, y]$ of differential
$k$-forms, when $k > 1$. In this case we have a choice for the generator of the free right
module. Indeed since the $k$th power of the exterior differential $d$ is not equal to zero
when $k < N$, i.e. $d^k \neq 0$ for $k < N$, a differential $k$-form $w$ may be expressed either
by means of $(dx)^k$ or by means of $d^k x$. Straightforward calculation shows that we
have the following relation between these generators

$$d^k x = \frac{[k]_q!}{q^{k(k-1)/2}}\, (dx)^k\, x^{1-k}. \tag{29}$$
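The relation between the generators $d^k x$ and $(dx)^k$ can be checked numerically. The following is a minimal sketch, assuming the matrix realization $x = \mathrm{diag}(1, q^{-1}, \ldots, q^{-(N-1)})$, $y$ the cyclic shift matrix, and the relation in the form $d^k x = [k]_q!\, q^{-k(k-1)/2}\, (dx)^k x^{1-k}$; the helper names are hypothetical.

```python
import numpy as np
from math import prod

N = 5
q = np.exp(2j * np.pi / N)
x = np.diag(q ** (-np.arange(N)))
y = np.roll(np.eye(N, dtype=complex), 1, axis=1)
power = np.linalg.matrix_power

def d(w, degree):
    """Graded q-commutator dw = y w - q^|w| w y for w homogeneous of the given degree."""
    return y @ w - q**degree * (w @ y)

def q_factorial(k):
    """[k]_q! = [1]_q [2]_q ... [k]_q with [m]_q = 1 + q + ... + q^(m-1)."""
    return prod(sum(q**j for j in range(m)) for m in range(1, k + 1))

dx = d(x, 0)
checks = []
form = x
for k in range(1, N):
    form = d(form, k - 1)                                    # now form = d^k x
    coeff = q_factorial(k) / q ** (k * (k - 1) // 2)
    checks.append(np.allclose(form, coeff * power(dx, k) @ power(x, 1 - k)))
```

All entries of `checks` come out true for $k = 1, \ldots, N-1$ in this realization.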


We will use the generator $(dx)^k$ of the free right module $\mathcal{A}_q^k[x, y]$ as a basis in
our calculations with differential $k$-forms. For any differential $k$-form $w \in \mathcal{A}_q^k[x, y]$
we have $dw \in \mathcal{A}_q^{k+1}[x, y]$. Let us express these two differential forms in terms of
the generators of the modules $\mathcal{A}_q^k[x, y]$ and $\mathcal{A}_q^{k+1}[x, y]$. We have $w = (dx)^k\, r$,
$dw = (dx)^{k+1}\, \tilde{r}$, where $r, \tilde{r} \in \mathcal{A}_q^0[x, y]$ are functions. Making use of the definition of
the exterior differential $d$ we calculate the relation between the functions $r, \tilde{r}$ which
is

$$\tilde{r} = (\Delta_q x)^{-1} \left(q^{-k}\, r - q^{k} A_1(r)\right), \tag{30}$$

where $A_1$ is the endomorphism of the algebra of functions $\mathcal{A}_q^0[x, y]$. This relation
shows that the exterior differential $d$ considered in the case of the differential $k$-forms
induces the mapping $\Delta_q^{(k)} : \mathcal{A}_q^0[x, y] \to \mathcal{A}_q^0[x, y]$ of the algebra of functions which
is defined by the formula

$$dw = (dx)^{k+1}\, \Delta_q^{(k)}(r), \tag{31}$$

where

$$w = (dx)^k\, r. \tag{32}$$

It is obvious that

$$\Delta_q^{(k)}(r) = (\Delta_q x)^{-1} \left(q^{-k}\, r - q^{k} A_1(r)\right). \tag{33}$$

Clearly for $k = 0$ the mapping $\Delta_q^{(0)}$ coincides with the derivative induced
by the differential $d$ in the first order calculus, i.e.

$$\Delta_q^{(0)}(r) = \partial r = (\Delta_q x)^{-1} \left(r - A_1(r)\right). \tag{34}$$


The higher order mappings $\Delta_q^{(k)}$, which we do not have in the case of a classical
exterior calculus on a one dimensional space, have the derivation type property

$$\Delta_q^{(k)}(r r') = \Delta_q^{(k)}(r)\, r' + q^k A_1(r)\, \Delta_q^{(0)}(r'), \tag{35}$$

where $k = 0, 1, 2, \ldots, N-1$. A higher order mapping $\Delta_q^{(k)}$ can be expressed in
terms of the derivative $\partial$ as a differential operator on the algebra of functions as
follows

$$\Delta_q^{(k)} = q^k\, \partial + \frac{q^{-k} - q^{k}}{1 - q}\, x^{-1}. \tag{36}$$
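The formulas for the mapping induced by $d$ on $k$-forms can be compared numerically. The following is a minimal sketch, assuming the matrix realization $x = \mathrm{diag}(1, q^{-1}, \ldots, q^{-(N-1)})$, $y$ the cyclic shift matrix, and the forms $\Delta_q^{(k)}(r) = (\Delta_q x)^{-1}(q^{-k} r - q^{k} A_1(r))$ and $\Delta_q^{(k)} = q^k \partial + \frac{q^{-k} - q^k}{1 - q} x^{-1}$; the function names are hypothetical.

```python
import numpy as np

N = 5
q = np.exp(2j * np.pi / N)
x = np.diag(q ** (-np.arange(N)))
y = np.roll(np.eye(N, dtype=complex), 1, axis=1)
x_inv, y_inv = np.linalg.inv(x), np.linalg.inv(y)
power = np.linalg.matrix_power

def A1(r):
    """Endomorphism A_1 defined by r y = y A_1(r)."""
    return y_inv @ r @ y

def partial(r):
    """Derivative of the first order calculus, formula (26)."""
    return x_inv @ (r - A1(r)) / (1 - q)

delta_x = x - A1(x)                                   # Delta_q x = (1 - q) x
dx = y @ delta_x                                      # dx = y Delta_q(x)

def Delta(k, r):
    """Mapping on functions induced by d on k-forms."""
    return np.linalg.inv(delta_x) @ (q ** (-k) * r - q**k * A1(r))

def Delta_via_partial(k, r):
    """The same mapping written as a differential operator."""
    return q**k * partial(r) + (q ** (-k) - q**k) / (1 - q) * (x_inv @ r)

rng = np.random.default_rng(2)
r = np.diag(rng.normal(size=N) + 1j * rng.normal(size=N))  # arbitrary function of x
```

For each $k$ one can then compare `Delta(k, r)` with `Delta_via_partial(k, r)`, and $y\,w - q^k\, w\,y$ for $w = (dx)^k r$ with $(dx)^{k+1} \Delta_q^{(k)}(r)$.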


Thus we see that the exterior calculus on a one dimensional space with coordinate $x$
satisfying $x^N = 1$ generated by the exterior differential $d$ satisfying $d^N = 0$ has
differential forms of higher order which are not present in the case of a classical
exterior calculus with $d^2 = 0$. The exterior differential of differential
forms can be expressed, via formula (36), by means of a derivative which satisfies the
twisted Leibniz rule (27).
Valued Custom Skew Fields with Generalised
PBW Property from Power Series 
Construction 


Lars Hellstrém 


Abstract This chapter describes a construction of associative algebras that, despite
starting from a commutation relation that the user may customize quite extensively,
still manages to produce algebras with a number of useful properties: they have a
Poincaré–Birkhoff–Witt type basis, they are equipped with a norm (actually an
ultranorm) that is trivial to compute for basis elements, they are topologically complete,
and they satisfy their given commutation relation. In addition, parameters can be
chosen so that the algebras will in fact turn out to be skew fields and the norms
become valuations. The construction is basically that of a power series algebra with
given commutation relation, stated to be effective enough that the other properties
can be derived. What is worked out in detail here is the case of algebras with two
generators, but only the analysis of the commutation relation is specific for that case.


Keywords Diamond Lemma · Commutation relation · Skew field construction ·
Ultranorm · Valuation · Irrational weighting of variables


1 Introduction 


Power series is one of those concepts which can turn out to be very different things 
in different branches of mathematics. In algebra, power series is one of many con- 
structions of new rings from old ones; depending on one’s point of view, the results 
may be anywhere from exciting to rather trivial. A combinatorialist regards a power 
series mostly as a fancy way to present a sequence, which none the less is quite 
useful since it comes with a host of dirty tricks that boil down to bold applications of 
elementary algebra. Pre-modern calculus used power series all over the place, mixing 
spectacular successes with equally spectacular failures that eventually earned them 
a bad reputation. But in modern analysis, which was born out of the need to put 
calculus on a rigorous foundation, the power series is just a special case of series: it
only means something if it converges, and the ultimate judge of convergence is the 
(point-set) topology. 

My need for the power series considered below arose in the context [3, pp. 100–101]
of looking for commuting homogeneous elements in a q-deformed Heisenberg–Weyl
algebra; concretely that algebra had two generators $A$ and $B$ satisfying the
commutation relation $AB - qBA = 1$ for some nonzero scalar $q$, and the question
was when two elements of the form

$$\sum_{i=0}^{\min\{k,l\}} r_i\, B^{k-i} A^{l-i}$$

(for different values of $k$, $l$, and scalars $r_i$) would commute with each other; the
product of two such homogeneous elements is again a homogeneous element, and
arbitrary algebra elements can be written as finite sums of homogeneous elements.
It turns out that there is a simple necessary condition in terms of the exponents in the
leading terms, and that when this condition is met and one homogeneous element is
given, the problem of determining the scalars in the other element is a straightforward
linear equation system with what is essentially a lower triangular matrix. The system
is however overdetermined: after getting to the equation that determines the last
scalar $r_{\min\{k,l\}}$, there remained a couple of equations that needed to be satisfied, which
they sometimes were and at other times were not; there did not seem to be a simple
condition that could determine beforehand in which case one would end up. But what
if there were no last scalar? If there in each new equation is also a new $r_i$ to absorb
whatever remains after having substituted known values of all $r_j$ with $j < i$, then
the system will always have a solution and the known necessary condition becomes
sufficient! This would however mean looking for a homogeneous element of the
form $\sum_{i=0}^{\infty} r_i\, B^{k-i} A^{l-i}$, which is not something that can be found in the original
algebra. Considering negative powers of the generators $A$ and $B$ may seem odd, but
is actually not unheard of in the literature on this problem. Making the sum infinite
is another matter: the proposed form is that of some kind of Laurent series, in two
noncommuting variables! Does that even exist?

In Sweden, 20th century mathematics was very much dominated by analysis, and
the shape that the following construction took is in a way a consequence of this:
anything that looked like an infinite sum had to be rigorously justified, and the one
true framework was that of analysis! Or so I believed, as a Ph.D. student; I have
subsequently learnt of other ways, mathematically no less rigorous, in which that
initial goal could have been achieved, but this very analytically flavoured approach
to noncommutative power series turned out to have some unexpected advantages. In
particular, several additional properties of the constructed object (some of which
were called for in the motivating problem about commuting elements, whereas others
were unexpected discoveries) follow with little extra effort once the foundation
has been laid. The following result provides a nice sample of what can be had.
Theorem 1 Consider the commutation relation

$$AB - qBA = \sum_{i=1}^{n} r_i \prod_{j=1}^{m_i} B^{k_{i,j}} A^{l_{i,j}} \tag{1}$$

where $n$ is a positive integer, $\{m_i\}_{i=1}^{n} \subset \mathbb{Z}_{>0}$, the coefficients $q \neq 0$ and $\{r_i\}_{i=1}^{n}$ are
scalars taken from some field $\mathcal{R}$, and the exponents $\{k_{i,j}, l_{i,j}\}_{j=1,\,i=1}^{m_i,\,n} \subset \mathbb{Z}$ are arbitrary.
If there exists a straight line in $\mathbb{R}^2$ such that the point $(1, 1)$ is on one side of
the line and all points $\left(\sum_{j=1}^{m_i} k_{i,j}, \sum_{j=1}^{m_i} l_{i,j}\right)$ for $i = 1, \ldots, n$ are on the other, then
there exists an $\mathcal{R}$-algebra $\mathcal{A}$, a function $a \mapsto \|a\| : \mathcal{A} \to \mathbb{R}$, two distinct elements
$A, B \in \mathcal{A}$, and two constants $\alpha, \beta \in \mathbb{R}$ such that:

1. The commutation relation (1) holds in $\mathcal{A}$.

2. The algebra $\mathcal{A}$ is a skew field, i.e., all nonzero elements in $\mathcal{A}$ are invertible.

3. $\|\cdot\|$ is an ultranorm on $\mathcal{A}$ and $\|a\|\, \|b\| = \|ab\|$ for all $a, b \in \mathcal{A}$.

4. $\mathcal{A}$ is complete in the topology induced by $\|\cdot\|$.

5. The set $\{B^k A^l\}_{k,l \in \mathbb{Z}}$ is an orthogonal Hilbert basis for $\mathcal{A}$ and $\|B^k A^l\| = 2^{\alpha l + \beta k}$.

6. Every nonzero $a \in \mathcal{A}$ has a unique leading term $r B^k A^l$, i.e., there exist unique
$r \in \mathcal{R}$ and $k, l \in \mathbb{Z}$ such that $\|a - r B^k A^l\| < \|a\|$.


It should be pointed out that this theorem does not exhaust the power of the 
construction, but rather provides a sample of the conclusions that can be drawn in the 
more advanced cases. Several variations are possible, such as relaxing the condition 
on the degrees in the right hand side at the price of instead adding conditions on 
the scalar coefficients of those terms. Not all conditions are needed for all of the 
conclusions, although the order in which the various conclusions are established 
comes with a couple of surprises. 

As long as the intent is only to construct the algebra $\mathcal{A}$, it is even possible to proceed
with only a few twists in addition to those anyway needed to produce some algebra
with two elements $A$ and $B$ satisfying (1). Recall that the classical construction would
be to:


1. Construct the free algebra $\mathcal{R}\langle X \rangle$ where $X = \{a, \bar{a}, b, \bar{b}\}$ is a set of four formal
variables. Variables $a$ and $b$ will give rise to the named elements $A$ and $B$, whereas
$\bar{a}$ and $\bar{b}$ are used to ensure that these have multiplicative inverses.
This free associative algebra $\mathcal{R}\langle X \rangle$ (the algebra of "noncommutative
polynomials" on $X$ over $\mathcal{R}$) is in the literature also known as the tensor algebra
$T(\mathrm{Span}_{\mathcal{R}}(X))$, but that would be a far more awkward way of looking at it,
considering what lies ahead.

2. Quotient $\mathcal{R}\langle X \rangle$ by the two-sided ideal $\mathcal{J}$ generated by the five elements

$$ab - qba - \sum_{i=1}^{n} r_i \prod_{j=1}^{m_i} b^{k_{i,j}} a^{l_{i,j}}, \quad a\bar{a} - 1, \quad \bar{a}a - 1, \quad b\bar{b} - 1, \quad \bar{b}b - 1.$$

Then $A = a + \mathcal{J} \in \mathcal{R}\langle X \rangle / \mathcal{J}$ and $B = b + \mathcal{J} \in \mathcal{R}\langle X \rangle / \mathcal{J}$ trivially satisfy (1). The
quotient $\mathcal{R}\langle X \rangle / \mathcal{J}$ is however typically nowhere near satisfying the other claims
of Theorem 1.


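The power series construction lengthens the above to:


1. Construct the free algebra $\mathcal{R}\langle X \rangle$ where $X = \{a, \bar{a}, b, \bar{b}\}$ is a set of four formal
variables (as before).

2. Let $\alpha, \beta, \gamma \in \mathbb{R}$ be constants such that $\alpha/\beta \in \mathbb{R} \setminus \mathbb{Q}$, $\alpha + \beta > \gamma$, and
$\beta \sum_{j=1}^{m_i} k_{i,j} + \alpha \sum_{j=1}^{m_i} l_{i,j} < \gamma$ for all $i = 1, \ldots, n$; this means $\beta k + \alpha l = \gamma$ is
the equation of one such line in the $kl$-plane as was required. Let $\mathcal{R}$ be normed
by the trivial norm (3). Define $v : X \to \mathbb{R}$ by $v(a) = -v(\bar{a}) = \alpha$ and $v(b) =
-v(\bar{b}) = \beta$, and let $\|\cdot\|$ be the $v$-degree norm on $\mathcal{R}\langle X \rangle$ (see Definition 3).

3. Construct the topological completion $\overline{\mathcal{R}\langle X \rangle}$ of $\mathcal{R}\langle X \rangle$ with respect to the norm $\|\cdot\|$.
This completion is an $\mathcal{R}$-algebra containing $\mathcal{R}\langle X \rangle$, and in particular containing
the elements $a$, $\bar{a}$, $b$, and $\bar{b}$.

4. Quotient $\overline{\mathcal{R}\langle X \rangle}$ by the topological closure $\overline{\mathcal{J}}$ of the two-sided ideal in $\overline{\mathcal{R}\langle X \rangle}$ that
is generated by the five elements

$$ab - qba - \sum_{i=1}^{n} r_i \prod_{j=1}^{m_i} b^{k_{i,j}} a^{l_{i,j}}, \quad a\bar{a} - 1, \quad \bar{a}a - 1, \quad b\bar{b} - 1, \quad \bar{b}b - 1. \tag{2}$$

Then $A = a + \overline{\mathcal{J}} \in \overline{\mathcal{R}\langle X \rangle} / \overline{\mathcal{J}}$ and $B = b + \overline{\mathcal{J}} \in \overline{\mathcal{R}\langle X \rangle} / \overline{\mathcal{J}}$ trivially satisfy (1), and
less trivially also the other claims of Theorem 1.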
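The conditions imposed on $\alpha$, $\beta$, $\gamma$ in step 2 are easy to test for concrete data. The following is a minimal sketch, assuming the conditions $\alpha + \beta > \gamma$ and $\beta \sum_j k_{i,j} + \alpha \sum_j l_{i,j} < \gamma$; the function name `separating_line_ok` and the encoding of the right hand side of (1) as lists of exponent pairs are hypothetical, and the irrationality of $\alpha/\beta$ is not tested since it cannot be decided in floating point arithmetic.

```python
import math

def separating_line_ok(alpha, beta, gamma, terms):
    """Check that beta*k + alpha*l = gamma separates (1, 1) from all exponent sums.

    terms holds one list of (k_ij, l_ij) exponent pairs per right-hand-side term.
    """
    if not alpha + beta > gamma:            # the point (1, 1) must lie above the line
        return False
    for factors in terms:
        k_sum = sum(k for k, _ in factors)
        l_sum = sum(l for _, l in factors)
        if not beta * k_sum + alpha * l_sum < gamma:
            return False                    # this term is not strictly below the line
    return True

# q-deformed Heisenberg-Weyl relation AB - qBA = 1: one term with exponent sums (0, 0)
heisenberg = [[(0, 0)]]
heisenberg_ok = separating_line_ok(1.0, math.sqrt(2), 1.0, heisenberg)
```

With $\alpha = 1$, $\beta = \sqrt{2}$, $\gamma = 1$ the relation $AB - qBA = 1$ passes the test, whereas a right hand side containing the term $BA$ would fail it for the same constants.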


In short, the extra steps are to construct a norm on $\mathcal{R}\langle X \rangle$, to form the completion
$\overline{\mathcal{R}\langle X \rangle}$, and to remember to take the closure of the ideal before forming the quotient.

The rest of this chapter is essentially a long proof of Theorem 1, with numerous
interspersed definitions of concepts that become relevant and (often informal)
discussions of techniques that are employed. Section 2 introduces the analysis-inspired
foundations for this power series algebra construction. Section 3 employs the
Diamond Lemma for power series algebras to analyze the result, which in particular
exhibits a basis of the quotient. The final Sect. 4 completes the proof, and goes on to
sketch some generalizations of the argument.


2 Normed Algebras 


The material in this section is essentially standard (even if it may be hard to find 
a Mathematics Subject Classification covering this body of knowledge). Therefore 
focus is primarily on giving full definitions for easy reference and secondarily on 
pointing out important features of the concepts defined. Proofs are mostly left as 
exercises to the reader, but the curious may find them in [4, Sect. 2.2–2.3].
Definition 1 Let $\mathcal{R}$ be a ring and let $\|\cdot\|$ be a function from $\mathcal{R}$ to $\mathbb{R}$. Then $\mathcal{R}$ is said
to be a ring with norm $\|\cdot\|$ if the following conditions are satisfied:


1. $\|a\| \geq 0$ for all $a \in \mathcal{R}$, and $\|a\| = 0$ if and only if $a = 0$.
2. $\|a - b\| \leq \|a\| + \|b\|$ for all $a, b \in \mathcal{R}$.
3. $\|ab\| \leq \|a\|\, \|b\|$ for all $a, b \in \mathcal{R}$.


If $\mathcal{R}$ is a ring with norm $\|\cdot\|$, but the norm is known from the context, then one may
simply say that $\mathcal{R}$ is a normed ring. If $\mathcal{R}$ is a ring with norm $\|\cdot\|$ then the function
$\|\cdot\|$ is called the ring norm or simply the norm.


Condition 2 above is just a more compact combination of two more intuitive
properties. One is that $\|{-b}\| = \|b\|$ for all $b \in \mathcal{R}$, since $\|{-b}\| = \|0 - b\| \leq \|0\| +
\|b\| = \|b\|$, and the same argument applied to $-b$ gives the reverse inequality.
This property is needed for the corresponding metric $\rho(a, b) = \|a - b\|$
to be symmetric. The other is the normal triangle inequality, which holds since
$\|a + b\| = \|a - (-b)\| \leq \|a\| + \|{-b}\| = \|a\| + \|b\|$.

Functional analysis provides plenty of examples of normed rings, for example
as Banach algebras, but those examples are not the ones which are of interest here.
Instead, the following norm will be frequently used:

$$\|a\| = \begin{cases} 0 & \text{if } a = 0, \\ 1 & \text{otherwise.} \end{cases} \tag{3}$$

This norm, which is called the trivial norm, is a ring norm for all rings $\mathcal{R}$. The
topology it introduces on the ring is not the trivial topology (where only $\emptyset$ and $\mathcal{R}$
itself are open sets) however, but the discrete topology (all subsets of $\mathcal{R}$ are open).


Definition 2 Let $\mathcal{R}$ be an associative and commutative normed ring with unit, and
let $|\cdot|$ be the norm on $\mathcal{R}$. Let $\mathcal{A}$ be an associative $\mathcal{R}$-algebra. Then $\mathcal{A}$ is said to be
a normed $\mathcal{R}$-algebra if there exists a function $\|\cdot\| : \mathcal{A} \to \mathbb{R}$, called the norm or
more precisely $\mathcal{R}$-algebra norm, such that $\mathcal{A}$ is a normed ring with ring norm $\|\cdot\|$
and

$$\|ra\| \leq |r|\, \|a\| \tag{4}$$

for all $r \in \mathcal{R}$ and $a \in \mathcal{A}$.

Analogously, an $\mathcal{R}$-module $\mathcal{M}$ is said to be a normed $\mathcal{R}$-module if there exists a
function $\|\cdot\| : \mathcal{M} \to \mathbb{R}$, called the norm or more precisely $\mathcal{R}$-module norm, such
that the following conditions are satisfied:


1. $\|a\| \geq 0$ for all $a \in \mathcal{M}$, and $\|a\| = 0$ if and only if $a = 0$.
2. $\|a - b\| \leq \|a\| + \|b\|$ for all $a, b \in \mathcal{M}$.
3. $\|ra\| \leq |r|\, \|a\|$ for all $r \in \mathcal{R}$ and $a \in \mathcal{M}$.


It is easily checked that if $\mathcal{A}$ is any associative $\mathcal{R}$-algebra, and $\|\cdot\|$ and $|\cdot|$ are the
trivial ring norms on $\mathcal{A}$ and $\mathcal{R}$ respectively, then $\mathcal{A}$ will be a normed $\mathcal{R}$-algebra with
norm $\|\cdot\|$. The only normed modules that will be of interest here are normed algebras
or submodules of normed algebras, but some of the concepts needed are more natural
to define for normed modules in general.

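The fact that (4) is an inequality, and not an equality, might seem strange at first.
It is necessary in the case of a general ring $\mathcal{R}$ however, since if $r_1, r_2 \in \mathcal{R}$ and $a \in \mathcal{A}$
are nonzero and satisfy $r_1 r_2 = 0$ then

$$0 = \|0\| = \|0a\| = \|r_1 r_2 a\| \leq |r_1|\, \|r_2 a\| \leq |r_1|\, |r_2|\, \|a\| > 0,$$

so at least one of the two inequalities in this chain has to be strict.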
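A concrete instance of such a strict inequality is provided by a ring with zero divisors. The following is a minimal sketch, assuming the ring $\mathbb{Z}/6\mathbb{Z}$ regarded as an algebra over itself and equipped with the trivial norm (3); the particular ring and the variable names are illustrative choices and do not come from the text.

```python
def trivial_norm(r):
    """Trivial norm (3): 0 for the zero element, 1 otherwise."""
    return 0 if r == 0 else 1

MOD = 6                      # Z/6Z has zero divisors: 2 * 3 = 0 (mod 6)
r1, r2, a = 2, 3, 1          # nonzero elements with r1 * r2 = 0

lhs = trivial_norm((r1 * r2 * a) % MOD)                         # ||r1 r2 a|| = ||0||
rhs = trivial_norm(r1) * trivial_norm(r2) * trivial_norm(a)     # |r1| |r2| ||a||
```

Here `lhs` equals 0 while `rhs` equals 1, so equality in (4) cannot hold for both scalar multiplications in the chain.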


The class of norms that will be most important in this paper are the $v$-degree
norms, which are easy to define on the free algebra $\mathcal{R}\langle X \rangle$. By definition every element
$a \in \mathcal{R}\langle X \rangle$ has a unique presentation as a sum $\sum_{\mu \in X^*} r_\mu \mu$, where $X^* \subset \mathcal{R}\langle X \rangle$ denotes
the free monoid on $X$ (i.e., the set of monomials in the noncommutative polynomial
ring $\mathcal{R}\langle X \rangle$, or equivalently the set of all elements in $\mathcal{R}\langle X \rangle$ that are finite products of
elements of $X$) and $\{r_\mu\}_{\mu \in X^*} \subset \mathcal{R}$ are the coefficients of those monomials; these sums
are furthermore finite, in the sense that $r_\mu = 0$ for all but a finite set of monomials $\mu$.


Definition 3 Let an associative and commutative ring with unit $\mathcal{R}$ be given, and let
$|\cdot|$ be a norm on $\mathcal{R}$. Let $X$ be a set and consider the free associative algebra $\mathcal{R}\langle X \rangle$.
Any function $v : X \to \mathbb{R}$ can be used as the seed function of a corresponding
$v$-degree norm. Since $X^*$ is the free monoid on $X$, the function $v$ extends uniquely to
a monoid homomorphism $(X^*, \cdot) \to (\mathbb{R}, +)$. Then the $v$-degree norm on $\mathcal{R}\langle X \rangle$ is
defined by

$$\Bigl\| \sum_{\mu \in X^*} r_\mu \mu \Bigr\| = \max_{\mu \in X^*} |r_\mu|\, 2^{v(\mu)} \tag{5}$$

where the maximum is surely attained since $|r_\mu|$ can be nonzero only for finitely
many $\mu \in X^*$.


The trivial norm is recovered when $v(x) = 0$ for all $x \in X$. Variables $x$ for which
$v(x) < 0$ are power-series-like in that higher powers get smaller (in norm) whereas
variables with $v(x) > 0$ behave more like the variables of an ordinary polynomial
ring.
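A $v$-degree norm is straightforward to compute for explicitly given elements of the free algebra. The following is a minimal sketch, assuming that the norm of $\sum_\mu r_\mu \mu$ is $\max_\mu |r_\mu|\, 2^{v(\mu)}$ with the trivial norm on the scalars; the representation of elements as dictionaries from words (tuples of variable names) to coefficients, the names `abar`, `bbar`, and the weights $\alpha = 1$, $\beta = \sqrt{2}$ are all hypothetical choices.

```python
import math
from collections import defaultdict

def trivial_norm(c):
    """Trivial norm on the scalars: 0 for zero, 1 otherwise."""
    return 0 if c == 0 else 1

def v_degree_norm(poly, v):
    """max over monomials of |r_mu| * 2^v(mu); v is extended additively to words."""
    return max((trivial_norm(c) * 2.0 ** sum(v[s] for s in word)
                for word, c in poly.items()), default=0.0)

def multiply(p, r):
    """Product in the free algebra: words are concatenated, coefficients multiplied."""
    out = defaultdict(int)
    for w1, c1 in p.items():
        for w2, c2 in r.items():
            out[w1 + w2] += c1 * c2
    return dict(out)

v = {"a": 1.0, "abar": -1.0, "b": math.sqrt(2), "bbar": -math.sqrt(2)}
p = {("a",): 1, ("bbar",): 5}            # a + 5*bbar
r = {("b",): 1, (): -1}                  # b - 1 (the empty word is the unit)
```

For these two elements the norm of the product equals the product of the norms, in line with the multiplicativity claimed in Theorem 1 for the final algebra; in general Definition 1 only guarantees an inequality.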


Definition 4 Let $\mathcal{R}$ be an associative unital ring with norm $|\cdot|$. Let $\mathcal{M}$ be an $\mathcal{R}$-module
with norm $\|\cdot\|$.

A subset $U \subseteq \mathcal{M}$ is an open subset in the topology induced by the norm $\|\cdot\|$
if there for every $a \in U$ exists some real number $\varepsilon > 0$ such that any $b \in \mathcal{M}$ with
$\|a - b\| < \varepsilon$ satisfies $b \in U$.

A sequence $\{a_n\}_{n=1}^{\infty} \subseteq \mathcal{M}$ is said to be a convergent sequence if there exists
some $b \in \mathcal{M}$, called the limit of the sequence, with the property that there for every
$\varepsilon > 0$ exists some integer $N$ such that $\|a_n - b\| < \varepsilon$ for every $n > N$. A limit point
of a set $U \subseteq \mathcal{M}$ is some $a \in \mathcal{M}$ with the property that there for every $\varepsilon > 0$ exists
some $b \in U \setminus \{a\}$ such that $\|a - b\| < \varepsilon$. A set $U \subseteq \mathcal{M}$ is a closed set in the topology
induced by the norm $\|\cdot\|$ if every limit point of $U$ is an element of $U$. The topological
closure of $U \subseteq \mathcal{M}$, denoted $\overline{U}$, is the smallest closed subset of $\mathcal{M}$ that contains $U$.


The usual laws of general topology hold with the definitions above: a subset
is closed if and only if its complement is open, an arbitrary union of open sets is
again open, but only finite intersections of open sets will necessarily be open, and so
on. Using more properties of a norm, one may show that the algebra operations
(addition, subtraction, ring multiplication, scalar multiple, and even the norm itself)
are all continuous with respect to the topology the norm induces; the algebraic and
the topological structures play very nicely together.

On the matter of continuity, it is also worth pointing out that the condition for
this can be simplified considerably in the case of linear maps: an $\mathcal{R}$-linear map
$f : \mathcal{M} \to \mathcal{M}$ is continuous if there for every $\varepsilon > 0$ exists some $\delta > 0$ such that
any $b \in \mathcal{M}$ with $\|b\| < \delta$ satisfies $\|f(b)\| < \varepsilon$; what happens is that continuity at 0
implies (uniform) continuity everywhere. The same thing happens for equicontinuity;
normally a family $F$ of maps $\mathcal{M} \to \mathcal{M}$ is said to be equicontinuous if there for
every $\varepsilon > 0$ exists some $\delta > 0$ such that it for all $f \in F$ and $a, b \in \mathcal{M}$ satisfying
$\|b - a\| < \delta$ holds that $\|f(b) - f(a)\| < \varepsilon$, but when all the maps in $F$ are linear
(indeed, it suffices that they are homomorphisms of the additive group) it is sufficient
to require this for $a = 0$.


Definition 5 Let $\mathcal{R}$ be an associative unital ring with norm $|\cdot|$. Let $\mathcal{M}$ be an $\mathcal{R}$-module
with norm $\|\cdot\|$. A sequence $\{a_n\}_{n=1}^{\infty} \subseteq \mathcal{M}$ is said to be a Cauchy sequence if there
for every $\varepsilon > 0$ exists some integer $N$ such that $\|a_n - a_m\| < \varepsilon$ for all $m, n > N$.
The set $\mathcal{M}$ is said to be topologically complete if every Cauchy sequence in it has a
limit in $\mathcal{M}$. A subset $S \subseteq \mathcal{M}$ is said to be dense in $\mathcal{M}$ if there for every $a \in \mathcal{M}$ and
every $\varepsilon > 0$ exists some $b \in S$ such that $\|b - a\| < \varepsilon$.


A key component in the power series algebra construction is the standard
construction of the completion of a normed module (which can also be carried out in
the greater generality of a metric space or alternatively that of a topological abelian
group), which given any module $\mathcal{M}$ produces a topologically complete module $\overline{\mathcal{M}}$
containing $\mathcal{M}$ as a dense subspace; writing $\overline{\mathcal{M}}$ for the completion is borderline an
abuse of notation, but as soon as one accepts that the completion exists as a
topological space and contains $\mathcal{M}$, then it follows from $\mathcal{M}$ being dense in the completion
that the completion is equal to the closure $\overline{\mathcal{M}}$. A nice feature of the completion is that
any continuous map from the original set $\mathcal{M}$ to a topologically complete set extends
uniquely to a continuous map defined on the whole completion. This can be used
to extend the algebraic operations to the completion. Moreover, the continuity then
implies that they still satisfy all the algebraic identities they had before extension,
so the completion $\overline{\mathcal{R}}$ of a normed ring $\mathcal{R}$ is again a normed ring, the completion $\overline{\mathcal{M}}$
of a normed $\mathcal{R}$-module $\mathcal{M}$ is again a normed $\mathcal{R}$-module, and the completion $\overline{\mathcal{A}}$ of a
normed $\mathcal{R}$-algebra $\mathcal{A}$ is again a normed $\mathcal{R}$-algebra.

Set-theoretically, the completion of $\mathcal{M}$ can be constructed as a set of equivalence
classes of Cauchy sequences in $\mathcal{M}$, where two sequences $\{a_n\}_{n=1}^{\infty}$ and $\{b_n\}_{n=1}^{\infty}$ are
equivalent if $\lim_{n\to\infty} (a_n - b_n) = 0$. The elements of $\mathcal{M}$ are then not strictly elements
of the completion, but there is a canonical embedding of $\mathcal{M}$ into $\overline{\mathcal{M}}$ that maps $a \in \mathcal{M}$
to the equivalence class of the Cauchy sequence where all elements are $a$.


Definition 6 A module (or ring) norm $\|\cdot\|$ defined on some module $\mathcal{M}$ (or ring $\mathcal{R}$)
is said to be a (module/ring) ultranorm if it satisfies the strong triangle inequality


\[ \|a + b\| \leq \max\{\|a\|, \|b\|\} \tag{6} \]


for all $a, b \in \mathcal{M}$ (or $\mathcal{R}$).


The trivial norm is obviously an ultranorm. A $v$-degree norm will also be an
ultranorm whenever the norm on the scalars is an ultranorm. Ultranorms are sometimes
said to be non-Archimedean, since they have the property that any sequence
$\{na\}_{n=1}^{\infty}$ of integer multiples of an element $a$ will be bounded. Classical
algebraic examples of ultranorms are provided by the $p$-adic valuations on $\mathbb{Q}$ (and more
generally on the field of $p$-adic numbers $\mathbb{Q}_p$).
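
As a small numerical illustration of the $p$-adic example, the following Python sketch checks the strong triangle inequality (6) on a handful of rationals and shows that the integer multiples of an element stay bounded in norm. The helper names `padic_valuation` and `padic_norm`, the choice $p = 3$, and the sample values are assumptions made here for illustration; they are not notation from the text.

```python
from fractions import Fraction

def padic_valuation(x: Fraction, p: int) -> int:
    """Exponent of p in the nonzero rational x (negative if p divides the denominator)."""
    if x == 0:
        raise ValueError("the valuation of 0 is left undefined in this sketch")
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

def padic_norm(x: Fraction, p: int) -> Fraction:
    """p-adic norm |x|_p = p**(-v_p(x)), with |0|_p = 0."""
    if x == 0:
        return Fraction(0)
    return Fraction(p) ** (-padic_valuation(x, p))

p = 3
samples = [Fraction(n, d) for n in range(-9, 10) for d in (1, 2, 3, 9)]

# Strong triangle inequality (6) on all sample pairs.
strong_triangle_holds = all(
    padic_norm(a + b, p) <= max(padic_norm(a, p), padic_norm(b, p))
    for a in samples
    for b in samples
)

# Norms of the integer multiples n*a for a = 5/9: none exceeds |a|_3.
a = Fraction(5, 9)
multiples = [padic_norm(n * a, p) for n in range(1, 50)]
```

With an ordinary absolute value the list `multiples` would grow without bound; here its maximum is the norm of `a` itself.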

A very striking property of ultrametric topology is the following “freshman’s 
dream:” 


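Lemma 1 If $\overline{\mathcal{M}}$ is the completion of the module $\mathcal{M}$ with respect to an ultranorm $\|\cdot\|$,
then the extension of this norm to $\overline{\mathcal{M}}$ is also an ultranorm and a series $\sum_{n=1}^{\infty} a_n$ with
terms in $\overline{\mathcal{M}}$ converges if and only if $\lim_{n\to\infty} \|a_n\| = 0$.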


In a sense, that is how one wants formal power series to behave: that there be no
risk of divergence due to interactions between terms. The price one pays for this is
however that the space becomes totally disconnected: every open $\varepsilon$-neighbourhood
$\{\, b \in \mathcal{M} \mid \|b - a\| < \varepsilon \,\}$ of a point $a$ is also topologically closed (since it is the
complement of the union of all $\varepsilon$-neighbourhoods that do not contain $a$)! A good intuitive
understanding of what an ultrametric space looks like can be had by imagining it as
a Cantor set.
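
The convergence criterion in Lemma 1 comes down to a tail estimate. As a sketch (the indices $m \leq k$ are introduced here for illustration), repeated use of the strong triangle inequality (6) bounds any block of consecutive terms by its largest member:

```latex
\[
  \Bigl\| \sum_{n=m}^{k} a_n \Bigr\| \leq \max_{m \leq n \leq k} \|a_n\| .
\]
```

Hence the partial sums form a Cauchy sequence as soon as $\|a_n\| \to 0$, and completeness supplies the limit. Under an ordinary norm the corresponding bound is the sum of the $\|a_n\|$ rather than their maximum, which is why, for instance, the harmonic series diverges in $\mathbb{R}$ although its terms tend to 0.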

Topology aside, there is a concept of algebraic provenance that is quite close to
that of an ultranorm, namely that of a valuation, so it should be sorted out how the 
two compare. One advantage of norms is that the notation is more standardised. 


codomain Norms assume values in $\mathbb{R}$, whereas the definition of valuations typically
permits an arbitrary totally ordered group, or even semigroup [6], as codomain of
the valuation map.
This may seem like a significant generalisation, but in practice it is not. The 
reason is mainly that a total order on a semigroup gives rise to a canonical order- 
preserving homomorphism to the real numbers [4, Theorem 3.40], essentially 
via the Eudoxian theory of proportion. (Book V of The Elements, with Euclid’s 
original proofs rather than modern arithmetic substitutes, becomes much more 
interesting if one allows oneself to consider magnitudes for which addition need 
not be commutative.)

direction of order For norms, it is a well established standard that small elements
have small norms (as measured by the standard order on $\mathbb{R}$). For a valuation $V$, it
rather depends on the author; we find some writing $V(a) > V(b)$ to mean that $b$
is smaller than $a$, whereas others take it to mean that $a$ is smaller than $b$, and the
strong triangle inequality might be written


\[ V(a+b) \leq \max\{V(a), V(b)\} \quad\text{or}\quad V(a+b) \geq \min\{V(a), V(b)\} \]


with the latter probably being more common. 

notation for group operation For norms, it is a well established standard that
multiplication in the ring corresponds to multiplication in $\mathbb{R}$, as in the inequality
$\|ab\| \leq \|a\| \, \|b\|$. For valuations, there is a variation in that some authors denote
the group operation as addition whereas others denote it as multiplication. Here,
it is addition that probably is the more common convention.

treatment of zero For norms, there is a clear convention that the norm of 0 is 0. For
valuations, one may either leave the valuation $V$ a partial function not defined for
0, or adjoin an extra element to the group to serve as $V(0)$. Under the “small element
has big value” convention, it may be convenient to name that extra element $\infty$.

equality For valuations (and assuming the additive convention), there is a strong 
preference that the multiplication axiom should be an equality: V(ab) = V(a) + 
V(b); a consequence is that the existence of a valuation implies the absence of 
zero divisors. For norms, it is rather quite common that the multiplication axiom 
is an inequality, and a notable feature if something satisfies it with equality. 
In the present construction, it is only at the very end that equality turns out to hold 
in the multiplication axiom for norms, so it makes sense to work with a concept 
that does not seem to imply it from the start. The main reason that one cannot 
assume equality is the step of forming the quotient. 


The main advantage of using norms here and now is rather that they have real numbers
as values, because elementary mathematics lets us do so much with real numbers; it
is trivial to state $\|B^k A^l\| = 2^{l\alpha + k\beta}$ and require that $\alpha/\beta$ is irrational.


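Definition 7 Let $\mathcal{M}$ be a normed $\mathcal{R}$-module with norm $\|\cdot\|$, and let $N$ be a submodule
of $\mathcal{M}$. Then the quotient norm $\|\cdot\|_{\mathcal{M}/N}$ on $\mathcal{M}/N$ is defined by


\[ \|a + N\|_{\mathcal{M}/N} = \inf_{c \in N} \|a + c\| \quad\text{for all } a \in \mathcal{M}. \tag{7} \]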


The quotient norm is an R-module norm if and only if N is topologically closed. 
It will be an R-algebra norm if M is a normed R-algebra and N is a two-sided ideal. 

The final piece of terminology in Theorem 1 that needs to be defined is that about
the orthogonal basis. Here it is useful to first write down a definition of a Hilbert
basis, since that is the basis concept that is predominant in this chapter. The concept of
Hilbert basis should be contrasted to that of a Hamel basis, where one only considers
finite linear combinations of basis elements.


Definition 8 Let $\mathcal{R}$ be an associative ring with unit and let $\mathcal{M}$ be a topological
$\mathcal{R}$-module. Let $Z \subseteq \mathcal{M}$ be arbitrary. Recall that the notation $\mathrm{Span}(Z)$ denotes the set
of all finite linear combinations of elements of $Z$. It is often convenient to have a
simple notation for the topological closure of this set as well. Therefore define


\[ \mathrm{Cspan}(Z) = \overline{\mathrm{Span}(Z)} \quad\text{for all sets } Z. \]


Linear independence also needs a topologized counterpart. Define the set $Z$ to be
topologically linearly independent if it is linearly independent and every countably
infinite sequence $\{\mu_i\}_{i=1}^{\infty}$ of distinct elements from $Z$ is such that: $0 \in \mathcal{M}$ is a limit
point of the sequence $\bigl\{ \sum_{i=1}^{n} r_i \mu_i \bigr\}_{n=1}^{\infty}$, where $\{r_i\}_{i=1}^{\infty} \subseteq \mathcal{R}$, if and only if $r_i = 0$ for
all $i \in \mathbb{Z}_{>0}$. The set $Z$ is said to be a Hilbert basis for $\mathcal{M}$ if it is topologically linearly
independent and $\mathcal{M} = \mathrm{Cspan}(Z)$.

In many cases, the most convenient way of showing that a set is a Hilbert basis 
is to show that it is an orthogonal basis. Contrary to popular opinion, the concept of 
orthogonality does not require an inner product; it can be defined in arbitrary normed 
spaces. The theory of orthogonality in normed spaces is however in many aspects 
different from the theory for inner product spaces. In particular the focus is shifted 
from elements to sets. 


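Definition 9 Let $\mathcal{M}$ be an $\mathcal{R}$-module with norm $\|\cdot\|$. A submodule $N_1 \subseteq \mathcal{M}$ is said
to be orthogonal to a submodule $N_2 \subseteq \mathcal{M}$ if $\|a + b\| \geq \|a\|$ for all $a \in N_1$ and
$b \in N_2$. A subset $Y$ of $\mathcal{M}$ is said to be orthogonal if for every bipartition $Y_1 \cup Y_2$ of
$Y$ ($Y_1 \cap Y_2 = \emptyset$) the module $\mathrm{Span}(Y_1)$ is orthogonal to $\mathrm{Span}(Y_2)$.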


An important example of an orthogonal set is the set $X^*$ of monomials in the
free algebra $\mathcal{R}\langle X\rangle$, when that is normed by a $v$-degree norm. A Hamel basis $Y$ of a
module $\mathcal{M}$ that is orthogonal with respect to the norm on $\mathcal{M}$ will be a Hilbert basis
of the completion $\overline{\mathcal{M}}$.

One notable property of orthogonal bases that the present norm-derived concept
shares with its counterpart in Hilbert spaces is the existence of an associated dual
basis, or more informally of “Fourier coefficients” for every module element.
Concretely, let $\mathcal{R}$ be a topologically complete normed associative ring with unit and
$\mathcal{M}$ be a normed $\mathcal{R}$-module with orthogonal basis $Y$. Then there exists for every
$\mu \in Y$ a continuous $\mathcal{R}$-module homomorphism $f_\mu\colon \mathcal{M} \to \mathcal{R}$ such that $f_\mu(\mu) = 1$
and $f_\mu(\rho) = 0$ for all $\rho \in Y \setminus \{\mu\}$; the dual basis consists of the family $\{f_\mu\}_{\mu \in Y}$ of
these maps. The continuity of these maps is relied upon in (8) below, to prove that
reductions are also continuous.


3 Rewriting and the Diamond Lemma 


Rewriting is usually classified as a branch of computer science, but it touches upon 
fundamental logic enough to be relevant for all of mathematics, especially
combinatorial algebra. What it contributes is in particular a framework for making certain
operations effective and thus decidable, where the traditional constructions of abstract
algebra would only produce an infinite set with no obvious algorithm for deciding 
membership. The application of rewriting that is of interest here is called equational 
reasoning, and addresses the quotient operation; we shall in particular deal with the 
quotient of an algebra by a two-sided ideal. 

A key feature in rewriting is the use of rewrite rules, each of which is abstractly an
instance of a relation $\to$ stating that the left hand side “may be changed into” the right hand side. In the
case of equational reasoning, the external justification for having a particular rule is 
that both sides of the relation are equivalent, so applying a rule preserves everything 
of interest. On the other hand, there is also an expectation that the right hand side 
(in some, not necessarily obvious, way) is simpler than the left hand side, so that the 
application of a rule can be viewed as a step of algebraic simplification. 

Rewriting comes in many flavours, distinguished by what the basic objects are 
that one rewrites. The one that corresponds to associative algebra (but also group 
theory) is called word rewriting since it operates on words, which in this case are 
defined to be finite sequences of symbols from some ground set X; those familiar 
with programming might find it more intuitive to read ‘word’ as ‘string’, since that 
is essentially what it is. For $X = \{a, b\}$, the first couple of words are 1 (the empty
word, a sequence of length zero), $a$, $b$, $aa = a^2$, $ab$, $ba$, $bb = b^2$, $a^3$, and so on.
The set of all words on X is in abstract algebra known as the free monoid on X, and 
conversely the operation of concatenating two words is denoted as multiplication 
because concatenation is the multiplication operation in the free monoid. 
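
For readers who prefer the string point of view, the enumeration above can be reproduced with a few lines of Python. This is only an illustrative sketch: the generator name `words` is made up here, Python strings stand in for words, and the empty string plays the role of the empty word 1.

```python
from itertools import count, islice, product

def words(alphabet):
    """Yield all words over `alphabet`, shortest first, each length in lexicographic order."""
    for n in count(0):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

# The first eight words on X = {a, b}; "" is the empty word.
first = list(islice(words("ab"), 8))

# Concatenation is the monoid multiplication, with the empty word as unit.
product_word = "ab" + "ba"
unit_check = ("" + "ab" == "ab") and ("ab" + "" == "ab")
```

The list `first` follows the same order as the enumeration in the text, with `"aaa"` standing for $a^3$.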

A standard trick for rewriting, when one aims to show something about an R- 
algebra of some kind [1, 2], is to work not with the bare words, but with formal 
linear combinations of words; rewrite rules then end up transforming elements of 
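$\mathcal{R}\langle X\rangle$ into other elements of $\mathcal{R}\langle X\rangle$. In the present setting, where the quotient to
examine is not one of $\mathcal{R}\langle X\rangle$ but one of its completion $\overline{\mathcal{R}\langle X\rangle}$, it is necessary to take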
that one step further and add also a topological structure to the objects being rewritten. 
The end result is however not too bad, since the three structures (monoid, linear, and 
topological) combine quite nicely. 

Getting more into the technicalities, it is convenient to consider a formalism
where a rewrite rule is a pair $(\mu, a)$, where the left hand side $\mu$ is a word, but
the corresponding right hand side $a$ can be an arbitrary element of $\overline{\mathcal{R}\langle X\rangle}$. A rule is
allowed to act upon any element of $\overline{\mathcal{R}\langle X\rangle}$ where $\mu$ occurs as a subexpression.


Definition 10 A rewrite system for $\overline{\mathcal{R}\langle X\rangle}$ is a set $S \subseteq X^* \times \overline{\mathcal{R}\langle X\rangle}$. The elements
of $S$ are called rules. Given any rule $s \in S$, the first component of $s$ (also called the
left hand side) will be written $\mu_s$, and the second component (also called the right
hand side) will be written $a_s$; thus $s = (\mu_s, a_s)$.

For every rewrite system $S$ is defined the corresponding ideal $\mathcal{I}(S)$, which is the
least topologically closed two-sided ideal in $\overline{\mathcal{R}\langle X\rangle}$ that contains $\{\mu_s - a_s \mid s \in S\}$.


The rewrite system that will be of interest here has 8 rules, which for consistency
with [4] will be named $s_1$ through $s_8$. We have almost seen five of those rules already,
namely

\[ s_1 = (ab,\; qba + c) \quad\text{where}\quad c = \sum_{i=1}^{n} r_i \prod_{j=1}^{m_i} b^{k_{ij}} a^{l_{ij}}, \]

\[ s_2 = (b\bar b,\, 1), \qquad s_5 = (a\bar a,\, 1), \]
\[ s_3 = (\bar b b,\, 1), \qquad s_6 = (\bar a a,\, 1); \]


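the five elements (2) generating the ideal by which we wish to quotient are exactly
$\mu_{s_i} - a_{s_i}$ for $i = 1, 2, 3, 5, 6$. The remaining rules are, with the same $c$ as in $s_1$,


\[ s_4 = (a\bar b,\; q^{-1}\bar b a - q^{-1}\bar b c \bar b), \]
\[ s_7 = (\bar a b,\; q^{-1} b \bar a - q^{-1}\bar a c \bar a), \]
\[ s_8 = (\bar a \bar b,\; q \bar b \bar a + \bar a \bar b c \bar b \bar a). \]

These are needed for technical reasons that will be apparent later, but

\[ \mathcal{I}(\{s_1, s_2, s_3, s_4, s_5, s_6, s_7, s_8\}) = \mathcal{I}(\{s_1, s_2, s_3, s_5, s_6\}) \]


so they do not change the constructed quotient algebra; concretely


\[ \mu_{s_4} - a_{s_4} = -q^{-1}\bar b(\mu_{s_1} - a_{s_1})\bar b - (\mu_{s_3} - a_{s_3})a\bar b + q^{-1}\bar b a(\mu_{s_2} - a_{s_2}), \]
\[ \mu_{s_7} - a_{s_7} = -q^{-1}\bar a(\mu_{s_1} - a_{s_1})\bar a - \bar a b(\mu_{s_5} - a_{s_5}) + q^{-1}(\mu_{s_6} - a_{s_6})b\bar a, \]
\[ \mu_{s_8} - a_{s_8} = -q\bar a(\mu_{s_4} - a_{s_4})\bar a - \bar a\bar b(\mu_{s_5} - a_{s_5}) + q(\mu_{s_6} - a_{s_6})\bar b\bar a. \]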


Collectively, the purpose of rules $s_1$ through $s_8$ is to provide a rewrite simplification
for every monomial that does not fit the PBW pattern (in this instance that pattern
is $B^k A^l$, so “first everything B, then everything A”) by on the one hand moving
any $a$ or $\bar a$ to the left of a $b$ or $\bar b$ to the right side of it (rules $s_1$, $s_4$, $s_7$, and $s_8$) and on the
other hand making a $b$ adjacent to $\bar b$ or $a$ adjacent to $\bar a$ cancel each other out (rules $s_2$,
$s_3$, $s_5$, and $s_6$). Rule $s_8$ might seem like it partially fails to do this, on account of the
$\bar a \bar b$ factor in the second term of its right hand side, but it will be all right in the limit.
The more general pattern is that one needs two rules for every named generator of
the algebra ($s_5$ and $s_6$ for A, $s_2$ and $s_3$ for B), and four rules for every commutation
relation. Rewriting can be used to study also algebras whose defining relations are
not simply commutation relations, but then the resulting basis will typically not be
of PBW-type.

Of course, whereas the concept of some $\mu$ occurring as a subexpression might
seem clear for something written down on paper, it is not obviously applicable for
general elements of the completion $\overline{\mathcal{R}\langle X\rangle}$. Hence it is convenient to also have an
alternative presentation in terms of a family of maps called reductions.


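Definition 11 Let $S$ be a rewriting system. Let $s = (\mu_s, a_s) \in S$ be an arbitrary rule
and let $\lambda, \rho \in X^*$ be arbitrary monomials. Let $t_{\lambda s\rho}\colon \overline{\mathcal{R}\langle X\rangle} \to \overline{\mathcal{R}\langle X\rangle}$ be defined by
that


\[ t_{\lambda s\rho}(b) = b + f_{\lambda\mu_s\rho}(b)\,\lambda(a_s - \mu_s)\rho \quad\text{for all } b \in \overline{\mathcal{R}\langle X\rangle} \tag{8} \]

(where $f_{\lambda\mu_s\rho}$ denotes one of the maps in the dual basis for $\overline{\mathcal{R}\langle X\rangle}$). This function
$t_{\lambda s\rho}$ can alternatively be characterised as being the unique continuous $\mathcal{R}$-module
homomorphism $\overline{\mathcal{R}\langle X\rangle} \to \overline{\mathcal{R}\langle X\rangle}$ which satisfies


\[ t_{\lambda s\rho}(\mu) = \begin{cases} \lambda a_s \rho & \text{if } \mu = \lambda\mu_s\rho, \\ \mu & \text{otherwise} \end{cases} \]

for all $\mu \in X^*$.
Let $T_0(S) = \{\mathrm{id}\}$, where $\mathrm{id}\colon \overline{\mathcal{R}\langle X\rangle} \to \overline{\mathcal{R}\langle X\rangle}$ is the identity map. Let

\[ T_1(S) = \{\, t_{\lambda s\rho} \mid \lambda, \rho \in X^* \text{ and } s \in S \,\}. \]

Recursively define

\[ T_{n+1}(S) = \{\, t_1 \circ t_2 \mid t_1 \in T_n(S) \text{ and } t_2 \in T_1(S) \,\} \]

for all $n \in \mathbb{Z}^{+}$. Set

\[ T(S) = \bigcup_{n \in \mathbb{N}} T_n(S). \]


The elements of $T(S)$ are called reductions and the elements of $T_1(S)$ are called
simple reductions. If $t(b) = b$ for some $b \in \overline{\mathcal{R}\langle X\rangle}$ then the reduction $t \in T(S)$ is
said to act trivially on $b$.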


In the reduction formalism, the counterpart of stating that b rewrites to b’ is that 
there is some reduction t for which t(b) = b’. The fact that the simple reductions 
only act nontrivially on one monomial each and also only rewrite one occurrence 
of the rule’s left hand side there implies that rewriting by reductions gives a very 
fine control of which rewrite steps are taken, something which will be useful later. 
There exist alternative rewriting formalisms which offer less control, for example 
requiring that all occurrences of a left hand side are recursively rewritten in each step 
(so-called generalised division) or requiring that it is always the leftmost occurrence 
of a left hand side in a monomial that is rewritten; such variations may remove 
some complications from the theory but introduce others (e.g. a leftmost occurrence 
rule may make it unclear whether the corresponding ideal automatically becomes 
two-sided). 
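
To make formula (8) concrete, here is a minimal Python sketch of a simple reduction acting on finite linear combinations of words. Several things are assumptions made for illustration only: elements are dictionaries from words to rational coefficients (so nothing topological is modelled), the function names are invented here, and the toy rule $(ab, 2ba + 1)$ is $s_1$ with the hypothetical choices $q = 2$ and $c = 1$.

```python
from fractions import Fraction

def coefficient(word, b):
    """Dual-basis map f_word: the coefficient of `word` in the element b."""
    return b.get(word, Fraction(0))

def add_scaled(b, scalar, c):
    """Return b + scalar*c, dropping terms whose coefficient becomes zero."""
    out = dict(b)
    for w, r in c.items():
        out[w] = out.get(w, Fraction(0)) + scalar * r
        if out[w] == 0:
            del out[w]
    return out

def simple_reduction(lam, rule, rho):
    """Build t_{lam s rho} as in (8): b -> b + f(b) * lam (a_s - mu_s) rho."""
    mu_s, a_s = rule
    target = lam + mu_s + rho
    # lam (a_s - mu_s) rho written as a linear combination of words
    delta = {lam + w + rho: r for w, r in a_s.items()}
    delta = add_scaled(delta, Fraction(-1), {target: Fraction(1)})
    return lambda b: add_scaled(b, coefficient(target, b), delta)

# Toy rule (ab, 2*ba + 1); the empty string is the empty word 1.
rule = ("ab", {"ba": Fraction(2), "": Fraction(1)})
t = simple_reduction("b", rule, "")          # acts on the monomial b.ab = "bab" only

b = {"bab": Fraction(3), "ab": Fraction(1)}
result = t(b)                                 # 3*bab is rewritten, 1*ab is left alone
untouched = t({"ab": Fraction(1)})            # t acts trivially here
```

The term `"ab"` in `b` survives even though it contains the left hand side of the rule, because this particular simple reduction only targets the monomial `"bab"`; that is the fine control referred to above.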

It follows from the way reductions are defined that $a - t(a) \in \mathcal{I}(S)$ for every
$a \in \overline{\mathcal{R}\langle X\rangle}$ and $t \in T(S)$. The ultimate point of the rewriting process is to exhibit
a simplest expression for any element of the quotient algebra, or more technically,
to find a unique representative in $\overline{\mathcal{R}\langle X\rangle}$ for every equivalence class in the quotient
$\overline{\mathcal{R}\langle X\rangle} / \mathcal{I}(S)$. The way to recognise these representatives is that all reductions act
trivially on them; if one cannot rewrite them to anything else, then apparently there
is not anything simpler. Or can there be? In general, there are indeed ways that it
can go wrong! The Diamond Lemma provides a checkable set of conditions which
ensure that everything works out right.


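Definition 12 Let a rewriting system $S$ be given; the concepts defined below are all
with respect to a particular rewriting system. Denote by $B_\varepsilon(b)$ the $\varepsilon$-neighbourhood
$\{\, a \in \overline{\mathcal{R}\langle X\rangle} \mid \|a - b\| < \varepsilon \,\}$ of some $b \in \overline{\mathcal{R}\langle X\rangle}$.

An $a \in \overline{\mathcal{R}\langle X\rangle}$ is said to be irreducible if $t(a) = a$ for all $t \in T(S)$. The set of all
irreducible elements in $\overline{\mathcal{R}\langle X\rangle}$ is denoted $\mathrm{Irr}(S)$.

An $a \in \overline{\mathcal{R}\langle X\rangle}$ is said to be stuck in $F \subseteq \overline{\mathcal{R}\langle X\rangle}$ if $t(a) \in F$ for all $t \in T(S)$. An
$a \in \overline{\mathcal{R}\langle X\rangle}$ is said to be persistently reducible if for every $t_1 \in T(S)$ and $\varepsilon > 0$
there exists some $t_2 \in T(S)$ and $b \in \mathrm{Irr}(S)$ such that $t_2(t_1(a))$ is stuck in $B_\varepsilon(b)$. The set of
all elements in $\overline{\mathcal{R}\langle X\rangle}$ that are persistently reducible is denoted $\mathrm{Per}(S)$.

An $a \in \overline{\mathcal{R}\langle X\rangle}$ is said to be uniquely reducible if, for all $t_1, t_2 \in T(S)$, $b_1, b_2 \in
\mathrm{Irr}(S)$, and $\varepsilon > 0$ such that $t_1(a)$ is stuck in $B_\varepsilon(b_1)$ and $t_2(a)$ is stuck in $B_\varepsilon(b_2)$, it
holds that $\|b_1 - b_2\| < \varepsilon$. The set of all $a \in \overline{\mathcal{R}\langle X\rangle}$ which are both persistently and
uniquely reducible is denoted $\mathrm{Red}(S)$.

A rewriting system $S$ for which $\mathrm{Red}(S) = \overline{\mathcal{R}\langle X\rangle}$ is said to be confluent. The map
$t^S\colon \mathrm{Red}(S) \to \mathrm{Irr}(S)$ is defined by that, for any $a \in \mathrm{Red}(S)$ and $\varepsilon > 0$, there exists
some $t \in T(S)$ such that $t(a)$ is stuck in $B_\varepsilon(t^S(a))$. The element $t^S(a)$ is called the
normal form of $a$.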


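The map $t^S$ constitutes a kind of limit of the set of all reductions $T(S)$; if $t^S(b) = b'$
then $b'$ is the unique limit point in $\mathrm{Irr}(S)$ of $\{t(b)\}_{t \in T(S)}$. For a confluent rewriting
system, $t^S$ becomes a projection of $\overline{\mathcal{R}\langle X\rangle} = \mathrm{Red}(S)$ onto $\mathrm{Irr}(S)$ and $\ker t^S = \mathcal{I}(S)$.
Hence $\mathrm{Irr}(S)$ is in that case isomorphic to the quotient $\overline{\mathcal{R}\langle X\rangle} / \mathcal{I}(S)$ as an $\mathcal{R}$-module,
but much simpler to describe. In the case $S = \{s_1, s_2, s_3, s_4, s_5, s_6, s_7, s_8\}$,


\[ \mathrm{Irr}(S) = \mathrm{Cspan}\bigl( \{ b^k a^l, \bar b^k a^l, b^k \bar a^l, \bar b^k \bar a^l \mid k, l \in \mathbb{Z}_{>0} \} \cup \{ b^k, \bar b^k, a^k, \bar a^k \mid k \in \mathbb{Z}_{>0} \} \cup \{1\} \bigr), \tag{9} \]


which is how it will follow that $\overline{\mathcal{R}\langle X\rangle} / \mathcal{I}(S)$ has a basis of the form $\{B^k A^l\}_{k,l \in \mathbb{Z}}$. But
it still remains to prove that this rewriting system $S$ is confluent.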

The main obstacle is to prove unique reducibility, but there are some smaller ones
that will need to be dealt with before that. First, it is useful to observe that $\|\mu_s\| =
\|a_s\|$ for all $s \in S = \{s_1, s_2, s_3, s_4, s_5, s_6, s_7, s_8\}$; this is no accident, but the result of
a deliberate design. For the formal inverse rules $s_2$, $s_3$, $s_5$, and $s_6$, it straightforwardly
follows from $v(\bar b) = -v(b)$ and $v(\bar a) = -v(a)$. For the commutation relation $s_1$,
it instead follows from $\|ab\| = \|qba\| > \|c\|$, and the inequality here is a direct
consequence of the condition in Theorem 1 about a straight line separating (1, 1)
from a bunch of other points. The significance of (1, 1) here is that it is the (bi)degree
of $ab$ and $ba$, whereas the other points are the (bi)degrees of the terms making up $c$.
The constants $\alpha = v(a)$ and $\beta = v(b)$ were chosen so that $\|c\| < \|qba\|$, which by
the strong triangle inequality makes $\|a_{s_1}\| = \|qba\|$. (That $\alpha/\beta$ is irrational is not
important for this step, but will be important later.) The corresponding calculations
for $s_4$, $s_7$, and $s_8$ are slightly embellished forms of that for $s_1$.
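
The step from $\|c\| < \|qba\|$ to $\|a_{s_1}\| = \|qba\|$ uses a standard consequence of the strong triangle inequality (6): the larger of two terms of different norm dominates their sum. As a sketch, with $x$ and $y$ as placeholder names and assuming $\|{-y}\| = \|y\|$ as for any module norm:

```latex
\[
  \|y\| < \|x\|
  \quad\Longrightarrow\quad
  \|x + y\| \leq \max\{\|x\|, \|y\|\} = \|x\|
  \quad\text{and}\quad
  \|x\| = \|(x + y) - y\| \leq \max\{\|x + y\|, \|y\|\} .
\]
```

Since $\|y\| < \|x\|$, the last maximum cannot be $\|y\|$, so it equals $\|x + y\|$ and therefore $\|x + y\| = \|x\|$. Taking $x = qba$ and $y = c$ gives $\|qba + c\| = \|qba\|$.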

A consequence of this is that $\|t(b)\| \leq \|b\|$ for all $b \in \overline{\mathcal{R}\langle X\rangle}$ and $t \in T(S)$ (first
prove it for $t \in T_1(S)$ using (8); it is not possible to get equality since terms may
cancel). In a confluent system, this implies that the infimum over $a \in \mathcal{I}(S)$ of $\|b + a\|$
is attained for $b + a = t^S(b)$, so the quotient norm on $\overline{\mathcal{R}\langle X\rangle} / \mathcal{I}(S)$ can be calculated
from the norm of the normal form representative. This completes the claims in point 5
of Theorem 1.

A further consequence is that $T(S)$ is equicontinuous, which is a technical
requirement in the topologized Diamond Lemma. Among other things, it ensures that $\mathrm{Per}(S)$
and $\mathrm{Red}(S)$ are topologically closed, and that $t^S$ is continuous. It follows already
from their definitions that $\mathrm{Irr}(S)$, $\mathrm{Per}(S)$ and $\mathrm{Red}(S)$ are $\mathcal{R}$-modules, that $\mathrm{Irr}(S)$ is
topologically closed, and that $t^S$ is an $\mathcal{R}$-module homomorphism and projection.

The second obstacle is to prove that $\mathrm{Per}(S) = \overline{\mathcal{R}\langle X\rangle}$, which morally constitutes
the claim that for every $b \in \overline{\mathcal{R}\langle X\rangle}$ and $\varepsilon > 0$ there exists some sequence of rewrite
steps which will remove all non-irreducible terms larger than $\varepsilon$ from $b$; technically the
definition of persistent reducibility has some extra twists to it, but those are there to
make it more convenient in proofs. The way that this is proved is by induction over the
monomials, since, given the above, $X^* \subseteq \mathrm{Per}(S)$ implies $\mathrm{Per}(S) = \overline{\mathcal{R}\langle X\rangle}$.

More precisely, the induction is a form of well-founded induction, so it is carried 
out with respect to a partial order P on X*. This partial order is what determines 
what it means for one expression or element of R(X) to be ‘simpler’ than another. 
A rewrite system S is said to be compatible with a partial order P if its right hand 
sides are smaller than its left hand sides. 


Definition 13 Let $P$ be a partial order on $X^*$. The down-set module of some $\mu \in X^*$
with respect to $P$ is the set


\[ \mathrm{DSM}(\mu, P) = \mathrm{Cspan}(\{\, \rho \in X^* \mid \rho < \mu \text{ in } P \,\}), \]


where ‘$\rho < \mu$ in $P$’ means ‘$\rho$ is strictly less than $\mu$ according to the partial order
$P$’. A rewrite rule $s = (\mu_s, a_s)$ is said to be compatible with $P$ if $a_s \in \mathrm{DSM}(\mu_s, P)$.
A rewrite system is compatible with $P$ if all rules in it are compatible with $P$.


What is needed for proving persistent reducibility is however that $t(\mu) \in \{\mu\} \cup
\mathrm{DSM}(\mu, P)$ for all $\mu \in X^*$ and $t \in T(S)$. Transitivity of $P$ makes this follow for
general reductions once it has been established for simple reductions $t_{\lambda s\rho}$, but the step
from rules to simple reductions requires a bit more from the partial order $P$, namely
that it is preserved under “padding” by arbitrary monomials $\lambda$ and $\rho$.


Definition 14 A partial order $P$ on a semigroup $\mathcal{S}$ is said to be a semigroup partial
order if for all $\mu, \nu, \lambda \in \mathcal{S}$ it holds that $\mu < \nu$ in $P$ implies $\lambda\mu < \lambda\nu$ in $P$ and $\mu\lambda <
\nu\lambda$ in $P$.


The construction of semigroup partial orders with which a given rewriting system 
can be compatible is something of an art in itself, but a useful technique is to layer
different ordering criteria on top of each other, so that if the first ordering criterion
does not distinguish two elements then the second is tried, and if that too considers 
them equal then a third is used, and so on. A convenient way to formalise this is to 
describe each layer as a semigroup quasi-order from a “toolbox” of simple generic 
constructions; the detail fitting to a particular rewriting system is achieved by on the 
one hand choosing parameters in the definitions of these quasi-orders, and on the 
other choosing how to combine the quasi-orders. See [4, Sect. 3.4] for a detailed 
treatment of this. 

In the present case, there is one more concern that needs to be taken into account 
when designing P, namely the well-foundedness that is the basis for the induction. 
Technically, the condition that needs to be satisfied is the following. 


Definition 15 If $P$ is a partial order on some $M \subseteq \overline{\mathcal{R}\langle X\rangle}$ such that every strictly
$P$-descending sequence $\{\rho_n\}_{n=1}^{\infty} \subseteq M$ (that is, $\rho_n > \rho_{n+1}$ in $P$ for all $n \in \mathbb{Z}_{>0}$)
satisfies $\|\rho_n\| \to 0$ as $n \to \infty$, then $P$ is said to satisfy the descending chain
condition in norm, or to be DCC in norm for short.


This descending chain condition supports induction of the following form: if
$L \subseteq M$ is a set such that


(basis) any $\rho \in M$ with $\|\rho\| < \varepsilon$ satisfies $\rho \in L$, and
(step) if $\rho \in M$ is such that any $\sigma < \rho$ in $P$ satisfies $\sigma \in L$ (a kind of condition
close to membership in $\mathrm{DSM}(\rho, P)$), then $\rho \in L$


then $L = M$. The descending chain form is however often more intuitive in an
algorithmic setting: rewriting $\rho_1$ might produce $\rho_2$, which in turn might rewrite to $\rho_3$,
and so on; then the descending chain condition implies that in the limit the
non-irreducible terms vanish, or more technically that a finite number of rewrite steps
suffice for getting rid of all non-irreducible terms of norm larger than some arbitrary
$\varepsilon > 0$. The linear structure adds some complications in that a rewrite step getting
rid of one term may introduce several new terms, which will require tracing several
descending chains, but this ends up not being a problem.

Thus having presented all the constraints on the partial order $P$, it is time to
present its exact definition. First, one must choose some real number $\theta > 1$ such that
$\|c\|\theta < \|ab\|$. Then $\mu < \rho$ in $P$ for some $\mu, \rho \in X^*$ in the following three cases:


1. $\|\mu\|\theta < \|\rho\|$,
2. $\|\mu\| = \|\rho\|$ but the length of the word $\mu$ is strictly less than the length of the word
$\rho$, or
3. $\|\mu\| = \|\rho\|$ and $\mu$ is the same length as $\rho$, but $\mu$ comes before $\rho$ in the word
lexicographic order which has $b < \bar b < a < \bar a$.


Compatibility-wise, case 1 takes care of the second terms of $a_{s_1}$, $a_{s_4}$, $a_{s_7}$, and
$a_{s_8}$, i.e., it implies that $c \in \mathrm{DSM}(\mu_{s_1}, P)$, $-q^{-1}\bar b c \bar b \in \mathrm{DSM}(\mu_{s_4}, P)$, $-q^{-1}\bar a c \bar a \in
\mathrm{DSM}(\mu_{s_7}, P)$, and $\bar a \bar b c \bar b \bar a \in \mathrm{DSM}(\mu_{s_8}, P)$. Case 2 takes care of $a_{s_2}$, $a_{s_3}$, $a_{s_5}$, and $a_{s_6}$
since 1 has word length 0 whereas $\mu_{s_2} = b\bar b$, $\mu_{s_3} = \bar b b$, $\mu_{s_5} = a\bar a$, and $\mu_{s_6} = \bar a a$ all
have word length 2. Finally case 3 takes care of the first terms of $a_{s_1}$, $a_{s_4}$, $a_{s_7}$, and
$a_{s_8}$; the referenced lexicographic order orders the length 2 monomials as

\[ bb < b\bar b < ba < b\bar a < \bar b b < \bar b\bar b < \bar b a < \bar b\bar a < ab < a\bar b < aa < a\bar a < \bar a b < \bar a\bar b < \bar a a < \bar a\bar a \]


although for the composite order $P$ it is likely that case 1 has preference for pairs of
words that contain different sets of letters.
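
The three cases defining $P$ can be turned into a small comparison function. The sketch below rests on several assumptions that are not fixed by the text: the norm of a word is taken to be $2^{v}$ with $v$ additive over letters, the values $\alpha = -1$, $\beta = -\sqrt{2}$ and $\theta = 1.1$ are hypothetical, the capital letters `A` and `B` stand for $\bar a$ and $\bar b$, and the letter order `b < B < a < A` is assumed. Equality of norms is decided exactly through the bidegree, which is legitimate only because $\alpha/\beta$ is irrational.

```python
import math

ALPHA, BETA = -1.0, -math.sqrt(2.0)   # hypothetical v(a), v(b) with irrational ratio
THETA = 1.1                            # hypothetical minimal step, theta > 1
LETTER_RANK = {"b": 0, "B": 1, "a": 2, "A": 3}   # assumed letter order

def bidegree(word):
    """(degree in a, degree in b), counting the barred letters A, B negatively."""
    return (word.count("a") - word.count("A"), word.count("b") - word.count("B"))

def log2_norm(word):
    """Base-2 logarithm of the assumed norm of a word."""
    deg_a, deg_b = bidegree(word)
    return deg_a * ALPHA + deg_b * BETA

def less_in_P(mu, rho):
    """True if mu < rho in the three-case order P."""
    if log2_norm(mu) + math.log2(THETA) < log2_norm(rho):      # case 1
        return True
    if bidegree(mu) != bidegree(rho):                          # norms differ, but by too little
        return False
    if len(mu) != len(rho):                                    # case 2
        return len(mu) < len(rho)
    return [LETTER_RANK[x] for x in mu] < [LETTER_RANK[x] for x in rho]   # case 3

case1 = less_in_P("aa", "a")        # norm drops by a factor 2 > theta
case2 = less_in_P("", "bB")         # equal norm, shorter word
case3 = less_in_P("ba", "ab")       # equal norm and length, lexicographic
incomparable = not less_in_P("a" * 7, "b" * 5) and not less_in_P("b" * 5, "a" * 7)
```

The last line exhibits two words whose norms are distinct but closer than the factor $\theta$, so $P$ relates them in neither direction; this is the reason $P$ is only a partial order.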

That the partial order $P$ so defined is a semigroup partial order follows from its
presentation as a lexicographic composition of semigroup quasi-orders, and will not
be shown explicitly; see [4, Sect. 3.4] for the details of how it is done. More interesting
is the way in which $P$ gets to be DCC in norm. Basically, the combination of cases 2
and 3 (the so-called length-lexicographic order) ends up being a well-order of
$X^*$; hence any infinite $P$-descending chain must have infinitely many steps at which
the strict descent is ruled according to case 1. Whenever that happens, the norm must
decrease by a factor $\theta^{-1} < 1$, so in the limit the norm tends to 0. But why must this
$\theta$ be explicit?

Had $\alpha/\beta$ been a rational number, then the set of possible norms would have been
discrete, and the quotient between two distinct norm values would have been bounded
away from 1; it would have been possible to state case 1 as $\|\mu\| < \|\rho\|$, since that
would automatically have implied $\|\mu\|\theta < \|\rho\|$ for some fixed $\theta > 1$ depending on
$\alpha$ and $\beta$. But when $\alpha/\beta$ is irrational the set of possible norms becomes dense in $\mathbb{R}_{>0}$;
without an explicit minimal step $\theta$, the order $P$ would not have become DCC in
norm. This has the consequence that $P$ does not relate two elements of distinct but
almost equal norm, so $P$ is not a total order. This comes with a slight penalty, in that
it precludes the use of some rewriting formalisms, in particular the standard bases
formalism of Mora [6], for analysing the present power series algebra construction,
but that seems unavoidable. Indeed, it turns out that several claims in Theorem 1
are true precisely in those cases which cannot be analysed using a monomial order
that is both total and DCC in norm! Hence the use of partial rather than total orders
really provides a practical advantage.

The above has cleared the second obstacle to confluence, so we can now go on
to the main obstacle, which is the uniqueness of the normal forms. With respect to an
arbitrary rewrite system, it is quite possible to find $a$, $b_1 = t_1(a)$ for some reduction
$t_1$, and $b_2 = t_2(a)$ for some other reduction $t_2$ such that $b_1 \neq b_2$ are both irreducible;
the rewrite system $\{s_1, s_2, s_3\}$ with $c = 1$ exhibits this for $a = ab\bar b$. That there can in
general be several different reductions which act nontrivially on an element facilitates
the formation of such forks in the rewriting process.
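
The fork at $ab\bar b$ can be reproduced mechanically. The following Python sketch searches all rewriting paths from a given element and collects the irreducible results. It is illustrative only and rests on assumptions made here: elements are finite dictionaries from words to rational coefficients, the capital letter `B` stands for $\bar b$, the value $q = 2$ is hypothetical, and $c = 1$ as in the example. It also shows, for this one element, that the fork disappears once rule $s_4$ is added.

```python
from fractions import Fraction

Q = Fraction(2)                                    # hypothetical value of q
S1 = ("ab", {"ba": Q, "": Fraction(1)})            # s1 with c = 1
S2 = ("bB", {"": Fraction(1)})                     # s2: b.bar(b) -> 1
S3 = ("Bb", {"": Fraction(1)})                     # s3: bar(b).b -> 1
S4 = ("aB", {"Ba": 1 / Q, "BB": -1 / Q})           # s4 with c = 1

def rewrite_once(elem, rules):
    """All results of rewriting one occurrence of one left hand side in one monomial."""
    results = []
    for word, coeff in elem.items():
        for lhs, rhs in rules:
            start = word.find(lhs)
            while start != -1:
                new = {w: r for w, r in elem.items() if w != word}
                for w, r in rhs.items():
                    key = word[:start] + w + word[start + len(lhs):]
                    new[key] = new.get(key, Fraction(0)) + coeff * r
                    if new[key] == 0:
                        del new[key]
                results.append(new)
                start = word.find(lhs, start + 1)
    return results

def normal_forms(elem, rules):
    """Irreducible elements reachable from elem (depth-first; assumes rewriting terminates)."""
    successors = rewrite_once(elem, rules)
    if not successors:
        return [elem]
    found = []
    for nxt in successors:
        for nf in normal_forms(nxt, rules):
            if nf not in found:
                found.append(nf)
    return found

start = {"abB": Fraction(1)}
fork = normal_forms(start, [S1, S2, S3])           # two distinct irreducible results
closed = normal_forms(start, [S1, S2, S3, S4])     # a single irreducible result
```

With only $s_1$, $s_2$, $s_3$ the search ends in either $a$ or $q\,ba\bar b + \bar b$, neither of which can be rewritten further. With $s_4$ available, the second branch continues and also arrives at $a$.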

The observation on which the Diamond Lemma is based is that such forks are not 
a problem unless they are final; as long as there is some common successor in both 
paths, we have not yet made an irrevocable decision on whether to go to $b_1$ or to $b_2$.
Hence if no decision ever was final, then there could not be two distinct normal forms
to choose between, and thus the normal forms would have to be unique! The classical
‘diamond condition’ (also known as local confluence) for which the Diamond Lemma
is named states that for every element $a$ (the apex of the diamond; ‘diamond’ here
simply means the geometric figure of a quadrilateral standing on a corner) and pair
of simple reductions $t_1$, $t_2$ acting nontrivially on $a$ there exist general reductions
$t_3$, $t_4$ such that $t_3(t_1(a)) = t_4(t_2(a))$; the elements $a$, $t_1(a)$, and $t_2(a)$ are the top three
corners of the diamond, whereas $t_3$ and $t_4$ contribute two additional sides that meet
at the fourth corner $t_3(t_1(a)) = t_4(t_2(a))$, thereby “closing the diamond”. Though
strictly speaking, in this topologized setting one can only count on the two sides
getting arbitrarily close, so the condition rather has to be that for every $\varepsilon > 0$
there exist general reductions $t_3$, $t_4$ such that $\|t_3(t_1(a)) - t_4(t_2(a))\| < \varepsilon$. There are as
always some technicalities involved, but by combining the diamond condition with
induction over the monomials one can prove the unique reducibility of all elements
of $\overline{\mathcal{R}\langle X\rangle}$.

The way in which one verifies the diamond condition is ultimately to do explicit 
calculations, but it would be impossible to do so for all a € R(X). It is, in view 
of how induction was used to establish persistent reducibility, probably no surprise 
that it suffices to verify the diamond condition for monomials a, but that would still 
leave infinitely many cases to check. Each such case would however be of the form 
that a monomial $\mu$ is acted nontrivially on by two simple reductions $t_1$ and $t_2$ (an
arrangement which is called an ambiguity). Since $t_1$ and $t_2$ are simple reductions, they
can be expressed more explicitly as $t_1 = t_{\lambda_1 s_1 \rho_1}$ and $t_2 = t_{\lambda_2 s_2 \rho_2}$, and since they both
act nontrivially on $\mu$ it must be the case that $\lambda_1 \mu_{s_1} \rho_1 = \mu = \lambda_2 \mu_{s_2} \rho_2$. Additional
restrictions can be imposed on the monomial factors $\lambda_1$, $\mu_{s_1}$, $\rho_1$, $\lambda_2$, $\mu_{s_2}$, and $\rho_2$,
because a lot of the separate cases that can be considered are in fact “padded”
versions of a simpler case, where any common prefix of $\lambda_1$ and $\lambda_2$ (and similarly any
common suffix of $\rho_1$ and $\rho_2$) has been shaved off; thanks to the fine control provided
by the minimalistic definition of reductions, it is always possible to insert arbitrary 
padding into all four sides of a known diamond. 

In the end, it turns out that the only cases one needs to check explicitly can be
specified as a quintuplet $(s_1, s_2, \nu_1, \nu_2, \nu_3)$ where $s_1, s_2 \in S$ are rules and
$\nu_1, \nu_2, \nu_3 \in X^*$ are monomials. The product $\nu_1\nu_2\nu_3$ is the apex corner of the diamond, $\nu_1$ is
the part of this monomial which is acted upon by $s_1$ but not $s_2$, $\nu_2$ is the part of this
monomial which is acted upon by both rules, and $\nu_3$ is again a part acted upon by only
one of the rules (usually $s_2$ in practice, but $s_1$ is theoretically possible). Hence either
$\mu_{s_1} = \nu_1\nu_2$ and $\mu_{s_2} = \nu_2\nu_3$, with the two simple reductions being $t_{1 s_1 \nu_3}$ and $t_{\nu_1 s_2 1}$,
or $\mu_{s_1} = \nu_1\nu_2\nu_3$ and $\mu_{s_2} = \nu_2$, with the two simple reductions being $t_{1 s_1 1}$ and $t_{\nu_1 s_2 \nu_3}$.
This means in particular that for any given pair of rules $(s_1, s_2)$ there is only a finite
number of ambiguities that need to be checked explicitly, so for any finite rewrite
system $S$ the number of cases to check is finite. For $S = \{s_1, s_2, s_3, s_4, s_5, s_6, s_7, s_8\}$,
there turn out to be twelve of them.
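
The count can be cross-checked mechanically from the leading monomials alone. The following is only a sketch under stated assumptions: it assumes that the eight rules have the two-letter leading monomials listed in `LEADING`, with the uppercase letters `A` and `B` used as ASCII stand-ins for the formal inverses of `a` and `b`; the encoding and the function name `ambiguities` are hypothetical and chosen for illustration.

```python
from itertools import product

# Assumed leading monomials of the rules s1..s8 (hypothetical encoding:
# 'A' and 'B' stand for the formal inverses of 'a' and 'b').
LEADING = {1: "ab", 2: "bB", 3: "Bb", 4: "aB", 5: "aA", 6: "Aa", 7: "Ab", 8: "AB"}


def ambiguities(leading):
    """List quintuplets (i, j, v1, v2, v3) describing the ambiguities to check.

    Overlap:   mu_i = v1 v2 and mu_j = v2 v3 with v1, v2, v3 nonempty.
    Inclusion: mu_i = v1 v2 v3 and mu_j = v2 for two different rules.
    """
    found = []
    for (i, m1), (j, m2) in product(leading.items(), repeat=2):
        # Overlaps: a proper suffix of m1 equals a proper prefix of m2.
        for k in range(1, min(len(m1), len(m2))):
            if m1[-k:] == m2[:k]:
                found.append((i, j, m1[:-k], m1[-k:], m2[k:]))
        # Inclusions: m2 occurs as a factor of m1 (rule i differs from rule j).
        if i != j:
            for p in range(len(m1) - len(m2) + 1):
                if m1[p:p + len(m2)] == m2:
                    found.append((i, j, m1[:p], m2, m1[p + len(m2):]))
    return found
```

Calling `ambiguities(LEADING)` returns one tuple per ambiguity; with the assumed leading monomials all of them are overlaps of length one, since no leading monomial is a factor of another.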

What does it mean to perform such a check, though? It would suffice to produce 
reductions $t_3$ and $t_4$ to close the diamond, but in the case of this reduction system it
turns out that this in some cases requires going to the limit, which is a bit awkward 
presentation-wise and also would require dealing with the explicit value of c. There 
is an alternative condition called relative resolvability which as far as the Diamond 
Lemma is concerned suffices just as well, but which requires introducing a few more 
concepts. 


Definition 16 Let $S$ be a rewriting system for $R(X)$ and let $P$ be a partial order
on $X^*$. The down-set ideal section of $\mu \in X^*$ with respect to $P$ and $S$ is denoted
$\mathrm{DIS}(\mu, P, S)$ and is characterised as being the least topologically closed
$R$-submodule of $R(X)$ that contains all $\lambda(\mu_s - a_s)\nu$ such that $\lambda\mu_s\nu < \mu$ in $P$ for
$\lambda, \nu \in X^*$ and $s \in S$. An ambiguity $(t_1, \mu, t_2)$ is said to be resolvable relative to $P$
if $t_1(\mu) - t_2(\mu) \in \mathrm{DIS}(\mu, P, S)$.


Informally the difference between relative resolvability and the ordinary resolv- 
ability using a diamond is that instead of having a lower half with two sides meeting 
each other in one minimum, the two horizontal extrema are joined with a jagged 
line, that sometimes goes up and sometimes down; this will still be fine provided all 
intermediate peaks are strictly below the apex of the upper half (as measured by the 
partial order P). 

Practically, a demonstration of relative resolvability can be presented in lines of
the form

$$(s_1, s_2, \nu_1, \nu_2, \nu_3) \qquad a_{s_1}\nu_3 - \nu_1 a_{s_2} = \text{simplified} = \text{DIS-form}$$

where the quintuplet identifies the ambiguity $(t_{1 s_1 \nu_3}, \nu_1\nu_2\nu_3, t_{\nu_1 s_2 1})$ being resolved.
Then comes an expression which is the difference $t_{1 s_1 \nu_3}(\nu_1\nu_2\nu_3) - t_{\nu_1 s_2 1}(\nu_1\nu_2\nu_3)$, and
in the next step that is simplified. The final step gives a presentation of that same
element of $R(X)$ which makes it obvious that it belongs to $\mathrm{DIS}(\nu_1\nu_2\nu_3, P, S)$. More
precisely, it is expressed as a linear combination of terms where each term contains
a factor $e_i := \mu_{s_i} - a_{s_i}$, and also each term is labelled with a reason why this term
is in $\mathrm{DSM}(\nu_1\nu_2\nu_3, P)$: ‘Norm’ refers to case 1 on p. 46 and ‘Lex’ refers to case 3; it
turns out case 2 never comes into play in these comparisons.


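$$
\begin{aligned}
&(s_1, s_2, a, b, \bar b): & a_{s_1}\bar b - a\,a_{s_2} &= qba\bar b + c\bar b - a = \underbrace{q b e_4}_{\text{Lex}} + \underbrace{e_2 a}_{\text{Lex}} - \underbrace{e_2 c\bar b}_{\text{Norm}} \\
&(s_2, s_3, b, \bar b, b): & a_{s_2} b - b\,a_{s_3} &= b - b = 0 \\
&(s_3, s_2, \bar b, b, \bar b): & a_{s_3}\bar b - \bar b\,a_{s_2} &= \bar b - \bar b = 0 \\
&(s_4, s_3, a, \bar b, b): & a_{s_4} b - a\,a_{s_3} &= q^{-1}\bar b a b - q^{-1}\bar b c\bar b b - a = \underbrace{q^{-1}\bar b e_1}_{\text{Lex}} + \underbrace{e_3 a}_{\text{Lex}} - \underbrace{q^{-1}\bar b c e_3}_{\text{Norm}} \\
&(s_5, s_6, a, \bar a, a): & a_{s_5} a - a\,a_{s_6} &= a - a = 0 \\
&(s_5, s_7, a, \bar a, b): & a_{s_5} b - a\,a_{s_7} &= b - q^{-1} a b\bar a + q^{-1} a\bar a c\bar a = -\underbrace{q^{-1} e_1\bar a}_{\text{Lex}} - \underbrace{b e_5}_{\text{Lex}} + \underbrace{q^{-1} e_5 c\bar a}_{\text{Norm}} \\
&(s_5, s_8, a, \bar a, \bar b): & a_{s_5}\bar b - a\,a_{s_8} &= \bar b - q a\bar b\bar a - a\bar a\bar b c\bar b\bar a = -\underbrace{q e_4\bar a}_{\text{Lex}} - \underbrace{\bar b e_5}_{\text{Lex}} - \underbrace{e_5\bar b c\bar b\bar a}_{\text{Norm}} \\
&(s_6, s_1, \bar a, a, b): & a_{s_6} b - \bar a\,a_{s_1} &= b - q\bar a b a - \bar a c = -\underbrace{q e_7 a}_{\text{Lex}} - \underbrace{b e_6}_{\text{Lex}} + \underbrace{\bar a c e_6}_{\text{Norm}} \\
&(s_6, s_4, \bar a, a, \bar b): & a_{s_6}\bar b - \bar a\,a_{s_4} &= \bar b - q^{-1}\bar a\bar b a + q^{-1}\bar a\bar b c\bar b = -\underbrace{q^{-1} e_8 a}_{\text{Lex}} - \underbrace{\bar b e_6}_{\text{Lex}} - \underbrace{q^{-1}\bar a\bar b c\bar b e_6}_{\text{Norm}} \\
&(s_6, s_5, \bar a, a, \bar a): & a_{s_6}\bar a - \bar a\,a_{s_5} &= \bar a - \bar a = 0 \\
&(s_7, s_2, \bar a, b, \bar b): & a_{s_7}\bar b - \bar a\,a_{s_2} &= q^{-1} b\bar a\bar b - q^{-1}\bar a c\bar a\bar b - \bar a \\
& & &= \underbrace{q^{-1} b e_8}_{\text{Lex}} + \underbrace{e_2\bar a}_{\text{Lex}} - \underbrace{e_7\bar b c\bar b\bar a}_{\text{Norm}} + \underbrace{\bar a e_2 c\bar b\bar a}_{\text{Norm}} - \underbrace{q^{-1}\bar a c e_8}_{\text{Norm}} \\
&(s_8, s_3, \bar a, \bar b, b): & a_{s_8} b - \bar a\,a_{s_3} &= q\bar b\bar a b + \bar a\bar b c\bar b\bar a b - \bar a \\
& & &= \underbrace{q\bar b e_7}_{\text{Lex}} + \underbrace{e_3\bar a}_{\text{Lex}} + \underbrace{\bar a\bar b c\bar b e_7}_{\text{Norm}} + \underbrace{q^{-1}\bar a\bar b c e_3\bar a}_{\text{Norm}} + \underbrace{q^{-1} e_8 c\bar a}_{\text{Norm}}
\end{aligned}
$$
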


This fulfils the last condition in the Diamond Lemma, so it is hereby established that
$R(X) = \mathrm{Irr}(S) \oplus \mathcal{I}(S)$ and hence that the quotient $R(X)/\mathcal{I}(S) \cong \mathrm{Irr}(S)$ as a vector
space.

For clarity, it might however be worth giving a more exact reference to the par- 
ticular Diamond Lemma that would be used. Theorem 3.30 of [4] will suffice, but it 
should be pointed out that the terminology used in this section is probably more in 
line with that of the later paper [5] where the two disagree. It is alternatively possible 
to use the more general Theorem 5.11 of [5] to establish the above conclusion; in that 
case one would make use of [5, Sect. 7] to set up the topological framework and [5, 
Ex. 6.10] to analyse the ambiguities. In both those statements of a Diamond Lemma, 
the first and second obstacles above address explicit conditions in the theorem
statement, whereas the “main” obstacle appears as one of several claims equivalent to
confluence and $R(X) = \mathrm{Irr}(S) \oplus \mathcal{I}(S)$.

Another point that should be addressed in this context is how the norm on the scalars
affects the set-up of the Diamond Lemma machinery. As long as that scalar norm is
trivial, it is sufficient to define $P$ as a partial order on the set of monomials $X^*$, but
if it is not then $P$ rather has to be defined as a partial order on the set of all terms $r\mu$
where $r \in R \setminus \{0\}$ and $\mu \in X^*$; this is needed because a term with large $\|\mu\|$ might
still be made small by a tiny $|r|$ (and vice versa). This setting of ordering the terms
is explicitly that which was used in [4], whereas [5] takes a more abstract route, but
for this chapter it seemed an additional complication that readers for the most part
were better off without.


4 The Unreasonable Usefulness of Irrationality 


Having pushed through the rather daunting mounds of technicalities in the previous
section, this seems like a good place to pause and evaluate where we are with respect
to proving the claims in Theorem 1. The algebra $\mathcal{A}$ is constructed as $R(X)/\mathcal{I}(S)$, and
it carries a quotient norm $\|\cdot\|$ inherited from the v-degree norm on $R(X)$. The algebra
$\mathcal{A}$ furthermore has two named elements $A = a + \mathcal{I}(S)$ and $B = b + \mathcal{I}(S)$. With that
in mind, the claims were:


1. The commutation relation (1) holds in $\mathcal{A}$.
That one has been obvious since Sect. 1.

2. The algebra $\mathcal{A}$ is a skew field, i.e., all nonzero elements in $\mathcal{A}$ are invertible.
That one has conversely seen no progress at all.

3. $\|\cdot\|$ is an ultranorm on $\mathcal{A}$ and $\|a\|\,\|b\| = \|ab\|$ for all $a, b \in \mathcal{A}$.
Here, there is partial progress. It was established in Sect. 3 that the quotient
norm on $\mathcal{A}$ is in fact equal to the norm on $\mathrm{Irr}(S)$ (composed with the
isomorphism between the two $R$-modules) so it will be an ultranorm, but the claim that
$\|a\|\,\|b\| = \|ab\|$ still needs to be proved.

4. $\mathcal{A}$ is complete in the topology induced by $\|\cdot\|$.
This follows from the isomorphism with $\mathrm{Irr}(S)$.

5. The set $\{B^k A^l\}_{k,l \in \mathbb{Z}}$ is an orthogonal Hilbert basis for $\mathcal{A}$ and $\|B^k A^l\| = 2^{l\alpha + k\beta}$.
This, too, follows from the isomorphism with $\mathrm{Irr}(S)$. It is in many ways the main
conclusion from applying the Diamond Lemma machinery.

6. Every nonzero $a \in \mathcal{A}$ has a unique leading term $rB^kA^l$, i.e., there exist unique
$r \in R$ and $k, l \in \mathbb{Z}$ such that $\|a - rB^kA^l\| < \|a\|$.
This has not been proved, but we are now ready to do it!


And the key to getting further is the irrationality of $\alpha/\beta$ that up to this point has
rather been a complication.

An immediate consequence of the ratio of $\alpha$ to $\beta$ being irrational is that the
map $\mathbb{Z} \times \mathbb{Z} \to \mathbb{R}$: $(k,l) \mapsto l\alpha + k\beta$ is injective, and thus the map $\mathbb{Z} \times \mathbb{Z} \to \mathbb{R}$:
$(k,l) \mapsto \|B^kA^l\|$ is injective as well. Hence every element of the basis for $\mathcal{A}$ has
a distinct norm, from which the unique leading term property immediately follows.
Every $a \in \mathcal{A}$ has, by the status of $\{B^kA^l\}_{k,l \in \mathbb{Z}}$ as a Hilbert basis, a unique presentation
of the form

$$a = \sum_{k,l \in \mathbb{Z}} r_{k,l} B^k A^l \quad \text{for some } \{r_{k,l}\}_{k,l \in \mathbb{Z}} \subseteq R. \qquad (10)$$

For any $\varepsilon > 0$, the set of $(k,l)$ such that $\|r_{k,l}B^kA^l\| > \varepsilon$ is finite (because
otherwise the series (10) would not converge) and for every $a \neq 0$ there is thus a
unique term $r_{k,l}B^kA^l$ that is maximal in norm. Since all other terms have strictly
smaller norm, it follows from the strong triangle inequality that
$\|a\| = \|r_{k,l}B^kA^l\| > \|a - r_{k,l}B^kA^l\|$.

Unique leading terms, invertibility of basis elements, and invertibility of scalars
are then all that is needed to prove invertibility of arbitrary nonzero elements, by using
the old trick of viewing the formula for the sum of a geometric series as a formula
for the inverse of things close to 1. Concretely, if $\|a\| = \|rB^kA^l\| > \|a - rB^kA^l\|$
then

$$a = rB^kA^l - (rB^kA^l - a) = \bigl(1 - (1 - r^{-1}aA^{-l}B^{-k})\bigr) \cdot rB^kA^l \qquad (11)$$

where $\|1 - r^{-1}aA^{-l}B^{-k}\| \le \|rB^kA^l - a\|\,|r|^{-1}\,\|A^{-l}\|\,\|B^{-k}\| < 1$ and thus

$$r^{-1}A^{-l}B^{-k} \sum_{n=0}^{\infty} (1 - r^{-1}aA^{-l}B^{-k})^n \qquad (12)$$

converges. It follows from (11) that (12) is in fact the multiplicative inverse of $a$, and
therefore any $a \in \mathcal{A} \setminus \{0\}$ is invertible. Thus $\mathcal{A}$ is a skew field.
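
The geometric series trick is easiest to see in a commutative toy setting. The sketch below is not the algebra constructed in this chapter: it assumes truncated power series over the rationals (coefficient lists modulo `x**N`), where an element with nonzero constant term `r` plays the role of something with leading term `r` and `u = 1 - a/r` is the part that is small in norm. The names `N`, `mul` and `geometric_inverse` are chosen for illustration only.

```python
from fractions import Fraction

N = 8  # truncation order: all series are coefficient lists of length N, modulo x**N


def mul(f, g):
    """Product of two truncated power series."""
    h = [Fraction(0)] * N
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            if i + j < N:
                h[i + j] += fi * gj
    return h


def geometric_inverse(a):
    """Inverse of a series a with a[0] != 0, using a = r * (1 - u), u = 1 - a / r."""
    r = a[0]
    u = [-c / r for c in a]
    u[0] = Fraction(0)  # the constant term of 1 - a/r is 1 - r/r = 0
    total = [Fraction(0)] * N
    power = [Fraction(1)] + [Fraction(0)] * (N - 1)  # u**0
    for _ in range(N):  # u**n vanishes modulo x**N once n >= N
        total = [s + p for s, p in zip(total, power)]
        power = mul(power, u)
    return [c / r for c in total]  # r**-1 * sum of u**n
```

For example, with `a = [Fraction(c) for c in (2, 1, 3, 0, 0, 0, 0, 0)]` the product `mul(a, geometric_inverse(a))` is the truncated series 1. The series terminates here only because of the truncation; in the complete setting it is the norm estimate that guarantees convergence.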
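A second consequence of the expression (12) is that $\|a^{-1}\| \le \|a\|^{-1}$, since

$$
\begin{aligned}
\|a^{-1}\| &= \Bigl\| r^{-1}A^{-l}B^{-k} \sum_{n=0}^{\infty} (1 - r^{-1}aA^{-l}B^{-k})^n \Bigr\| \le \\
&\le |r^{-1}|\,\|A^{-l}\|\,\|B^{-k}\|\,\Bigl\| \sum_{n=0}^{\infty} (1 - r^{-1}aA^{-l}B^{-k})^n \Bigr\| = \\
&= |r|^{-1}\,2^{-l\alpha}\,2^{-k\beta}\,\|1\| = \|rB^kA^l\|^{-1} = \|a\|^{-1}.
\end{aligned}
$$

On the other hand,

$$1 = \|1\| = \|aa^{-1}\| \le \|a\|\,\|a^{-1}\| \le \|a\|\,\|a\|^{-1} = 1,$$

so it must in fact be the case that $\|a^{-1}\| = \|a\|^{-1}$. This makes it easy to prove that
$\|a\|\,\|b\| = \|ab\|$ for all $a, b \in \mathcal{A}$. If $a$ or $b$ is $0$ then that claim is trivial, whereas if
$a$ and $b$ are both invertible then

$$
\begin{aligned}
\|a\|\,\|b\| &= \|a^{-1}\|^{-1}\,\|b^{-1}\|^{-1} = \bigl(\|b^{-1}\|\,\|a^{-1}\|\bigr)^{-1} \le \\
&\le \|b^{-1}a^{-1}\|^{-1} \\
&= \|(ab)^{-1}\|^{-1} \\
&= \|ab\|,
\end{aligned}
$$

which, together with the inequality $\|ab\| \le \|a\|\,\|b\|$ already used above, forces
equality also in this case; the algebra $\mathcal{A}$ is valued in the sense that the
norm gives rise to a valuation, in the strict interpretation which requires equality
in the multiplication axiom. This concludes the proof of claim 3, and thus also of
Theorem 1 as a whole.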

If instead of having the norm $|\cdot|$ on the field $R$ be trivial one allows it to be an
arbitrary valuation (ultranorm satisfying the multiplication axiom with equality), then
the degree conditions on the terms in the remainder $c$ can be relaxed. There is still
the requirement that $\|c\|\,\theta \le \|ab\|$ for some $\theta > 1$, but now that only boils down to
$|r|\,\|\mu\|\,\theta \le \|ab\|$ for every term $r\mu$ in $c$, and if $|r|$ is small then that may compensate
for $\|\mu\|$ being large, but also vice versa. The coefficient $q$ in the term $qba$ must satisfy
$|q| = 1$, since rewriting requires both $\|ab\| \ge \|qba\|$ and $\|ab\| \ge \|q^{-1}ba\|$.

That $R$ is a field really only becomes important in the penultimate step of proving
that $\mathcal{A}$ is a skew field; the uniqueness of the leading term follows even for a general
coefficient ring $R$. That $q$ is invertible is on the other hand necessary already for
setting up the rewriting system, since rules $s_4$ and $s_7$ in a sense have the roles of left
hand side and leading term of the right hand side reversed from what they are in $s_1$.

From the point of view of non-power-series rewriting, it is somewhat surprising
that the exact value of the “remainder” $c$ turns out not to matter: as long as it is
smaller (in norm) than the other terms $ab$ and $qba$ of the commutation relation, it
can be whatever one wants! Part of this is due to only having two generators $A$ and
$B$, since having several commutation relations can lead to them interacting with each
other in nontrivial ways, but part of it is also due to the simplifying ability of power
series; a finite sum $\sum_{n=0}^{N} r_n x^n$ is only a polynomial, but an infinite sum $\sum_{n=0}^{\infty} r_n x^n$
can reproduce any analytic function.
Computing Burchnall–Chaundy Polynomials
with Determinants


Johan Richter and Sergei Silvestrov 


Abstract In this expository paper we discuss a way of computing the Burchnall–Chaundy
polynomial of two commuting differential operators using a determinant.
We describe how the algorithm can be generalized to general Ore extensions, and
which properties of the algorithm are preserved.


Keywords Ore extensions - Burchnall-Chaundy theory - Determinants 


1 Introduction 


It is a classical result, going back to [2-4], that all pairs of commuting elements in
the Weyl (Heisenberg) algebra are algebraically dependent over $\mathbb{C}$. This result was
later rediscovered and applied to the study of non-linear partial differential equations
[9, 10, 12].

In this paper we will describe a method, the Burchnall—Chaundy eliminant con- 
struction, for computing explicitly the algebraic relation satisfied by two commuting 
elements, and consider the generalisation to the class of rings known as Ore exten- 
sions. We will describe results showing that the eliminant construction partially 
generalises. We will also give counterexamples showing that these generalisations 
do not always retain all desired properties. 
2 Definitions


We recall the following definition. 


Definition 1 Let $R$ be a ring, $\sigma$ an endomorphism of $R$ and $\delta$ an additive function,
$R \to R$, satisfying
$$\delta(ab) = \sigma(a)\delta(b) + \delta(a)b$$
for all $a, b \in R$. (Such maps $\delta$ are known as $\sigma$-derivations.) The Ore extension $R[x; \sigma, \delta]$
is the polynomial ring $R[x]$ equipped with a new multiplication such that
$xr = \sigma(r)x + \delta(r)$ for all $r \in R$. Every element of $R[x; \sigma, \delta]$ can be written uniquely as
$\sum_i a_i x^i$ for some $a_i \in R$.

If $\sigma = \mathrm{id}$ then $R[x; \mathrm{id}, \delta]$ is called a differential operator ring. If $P = \sum_{i=0}^{n} a_i x^i$,
with $a_n \neq 0$, we say that $P$ has degree $n$. We say that the zero element has degree
$-\infty$.


Ore extensions were defined by the Norwegian mathematician Ore [13] as a non- 
commutative analogue of polynomial rings. 


Definition 2 The Weyl (or Heisenberg) algebra can be defined as the Ore extension
$\mathbb{C}[y][x; \mathrm{id}, \delta]$ where $\delta$ is the usual algebraic derivative on $\mathbb{C}[y]$.

The $q$-deformed Weyl algebra can be defined as the Ore extension $\mathbb{C}[y][x; \sigma, \delta]$
where $\sigma(a) = a$ and $\delta(a) = 0$ for all $a \in \mathbb{C}$ and where $\sigma(y) = qy$ and $\delta(y) = 1$.


We will simply refer to a $q$-deformed Weyl algebra as a $q$-Weyl algebra.
A $q$-Weyl algebra is thus an algebra over $\mathbb{C}$ with two generators, $x$ and $y$, such
that $xy = qyx + 1$. The Weyl algebra is the special case when $q = 1$.

If $A$ is any algebra over a ring $R$ and $P$, $Q$ are two commuting elements of $A$
then we say that $P$, $Q$ are algebraically dependent if $f(P, Q) = 0$ for some
non-zero polynomial $f(s, t) \in R[s, t]$ in two central indeterminates $s$ and $t$ over $R$. The
polynomial $f$ is called an annihilating polynomial.


3 Algebraic Dependence 


In a series of papers in the 1920s and 30s [2-4], Burchnall and Chaundy studied 
the properties of commuting pairs of ordinary differential operators. The following 
theorem is essentially found in their papers. (Their paper is somewhat imprecise on 
formal details.) 

Theorem 1 Let $P = \sum_{i=0}^{n} p_i D^i$ and $Q = \sum_{j=0}^{m} q_j D^j$ be two commuting elements
of $T$ with constant leading coefficients. Then there is a non-zero polynomial $f(s, t)$
in two commuting variables over $\mathbb{C}$ such that $f(P, Q) = 0$. Note that the fact that
$P$ and $Q$ commute guarantees that $f(P, Q)$ is well-defined.
Burchnall’s and Chaundy’s work relies on analytical facts, such as the existence
theorem for solutions of linear ordinary differential equations. However, it is possible
to give algebraic proofs for the existence of the annihilating polynomial. This was
done later by authors such as Amitsur [1] and Goodearl [5, 8]. Once one casts
Burchnall’s and Chaundy’s results in an algebraic form one can also generalize them
to a broader class of rings.

More specifically, one can prove Burchnall’s and Chaundy’s result for certain 
types of Ore extensions. We cite an important early result by Amitsur as an example. 

Amitsur [1, Theorem 1] (following work of Flanders [7]) studied the case when
$R$ is a field of characteristic zero and $\delta$ is an arbitrary derivation on $R$. He obtained
the following theorem.


Theorem 2 Let $K$ be a field of characteristic zero with a derivation $\delta$. Let $F$ denote
the subfield of constants. Form the differential operator ring $S = K[x; \mathrm{id}, \delta]$, and
let $P$ be an element of $S$ of degree $n$. Denote by $F[P]$ the ring of polynomials
in $P$ with constant coefficients, $F[P] = \{\sum_{j=0}^{m} b_j P^j \mid b_j \in F\}$. Then $C_S(P)$ is a
commutative subring of $S$ and a free $F[P]$-module of rank at most $n$.


The next corollary can be found in [1, Corollary 2]. 


Corollary 1 Let $P$ and $Q$ be two commuting elements of $K[x; \mathrm{id}, \delta]$, where $K$ is a
field of characteristic zero. Then there is a nonzero polynomial $f(s, t)$, with
coefficients in $F$, such that $f(P, Q) = 0$.


Proof Let $P$ have degree $n$. Since $Q$ belongs to $C_S(P)$ we know that $1, Q, \ldots, Q^n$
are linearly dependent over $F[P]$ by Theorem 2. But this tells us that there are
elements $\phi_0(P), \phi_1(P), \ldots, \phi_n(P)$ in $F[P]$, of which not all are zero, such that

$$\phi_0(P) + \phi_1(P)Q + \cdots + \phi_n(P)Q^n = 0.$$

Setting $f(s, t) = \sum_{i=0}^{n} \phi_i(s) t^i$ the corollary is proved. $\square$


4 The Determinant Construction 


The cited result by Amitsur is an existence proof, but Burchnall and Chaundy also
gave an algorithm for computing the annihilating polynomial in the case of
differential operators. In this section we will describe this algorithm for the similar case
of the $q$-Weyl algebra.

Let $P = \sum_{i=0}^{n} p_i(y)x^i$ and $Q = \sum_{j=0}^{m} q_j(y)x^j$ be commuting elements in a
$q$-Weyl algebra. For $e = 0, 1, \ldots, m - 1$ compute

$$x^e(P - s) = \sum_{i=0}^{n+m-1} p_{i,e}(y, s)\,x^i$$

and similarly, for $l = 0, 1, \ldots, n - 1$ compute

$$x^l(Q - t) = \sum_{j=0}^{n+m-1} q_{j,l}(y, t)\,x^j.$$

Here the computation is done in the ring $\mathbb{C}[y][x; \sigma, \delta][s, t]$, the polynomial ring
in two central indeterminates over $\mathbb{C}[y][x; \sigma, \delta]$. Form a square matrix of size $n + m$
with $p_{i,e}$ as the element in row $e + 1$ and column $n + m - i$. We let $q_{j,l}$ be the matrix
element in row $l + m + 1$ and column $n + m - j$. The determinant of this matrix
will be called the eliminant (of $P$ and $Q$) and denoted $\Delta_{P,Q}$.

De Jeu, Svensson and Silvestrov [6] prove the following theorem. 


Theorem 3 Let $K$ be a field, and $q$ an element of $K$ such that $\sum_{i=0}^{N} q^i \neq 0$ for
all natural numbers $N$. (Note that such a $q$ only exists if $K$ is an infinite field.) Let
$\Delta_{P,Q}$ denote the eliminant constructed above (a polynomial in $y$, $s$ and $t$). Write
$\Delta_{P,Q} = \sum_i f_i(s, t)y^i$. Then

(i) at least one of the $f_i$ is non-zero;

(ii) $f_i(P, Q) = 0$ for all $i$.


In the case when $K = \mathbb{R}$ and $q = 1$, this is the same method as Burchnall and
Chaundy describe.


Example 4 That a condition on $q$ is needed in the theorem can be seen as follows:
if $q$ is a primitive $n$th root of unity, where $n > 1$, then $x^n$ and $y^n$ both belong to the
center of $\mathbb{C}[y][x; \sigma, \delta]$. But there is no non-zero polynomial over $\mathbb{C}$ that annihilates
$x^n$ and $y^n$.

Example 5 We describe an example of the eliminant when $q = 1$. Let $P = yx$ and
$Q = y^2x^2$. Then

$$x^0(P - s) = yx - s,$$

$$x^1(P - s) = yx^2 + (1 - s)x,$$

and

$$x^0(Q - t) = y^2x^2 - t.$$

Thus

$$\Delta_{P,Q} = \begin{vmatrix} 0 & y & -s \\ y & 1 - s & 0 \\ y^2 & 0 & -t \end{vmatrix} = (t + s(1 - s))y^2.$$

Since indeed $Q + P(1 - P) = 0$, this is consistent with Theorem 3.


Example 6 We can describe a similar example in the $q$-Weyl algebra. Set $P = yx$
and $Q = y^2x^2$, again. One can check that

$$PQ = QP = q^2y^3x^3 + (q + 1)y^2x^2.$$

The eliminant becomes

$$\Delta_{P,Q} = \begin{vmatrix} 0 & y & -s \\ qy & 1 - s & 0 \\ y^2 & 0 & -t \end{vmatrix} = (qt + s(1 - s))y^2.$$

As expected we recover the result of the previous example by setting $q = 1$.
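
The construction can be tried out numerically. The following is a minimal sketch, assuming a numeric value of $q$ given as a `Fraction` and an ad hoc encoding that is not taken from the cited papers: an element of the $q$-Weyl algebra is a dict `{(j, k): c}` meaning a sum of terms `c * y**j * x**k`, and the eliminant is returned as a dict `{(ey, es, et): c}` of monomials in $y$, $s$, $t$. All function names are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import permutations


def q_number(b, q):
    """[b]_q = 1 + q + ... + q**(b-1), so that delta(y**b) = [b]_q * y**(b-1)."""
    return sum(q**i for i in range(b))


def x_pow_y_pow(i, b, q):
    """Normal form of x**i * y**b as {(j, k): c}, meaning a sum of c * y**j * x**k."""
    if i == 0 or b == 0:
        return {(b, i): Fraction(1)}
    out = defaultdict(Fraction)
    # x**i y**b = x**(i-1) * (q**b y**b x + [b]_q y**(b-1))
    for (j, k), c in x_pow_y_pow(i - 1, b, q).items():
        out[(j, k + 1)] += c * q**b
    for (j, k), c in x_pow_y_pow(i - 1, b - 1, q).items():
        out[(j, k)] += c * q_number(b, q)
    return dict(out)


def row(e, op, var, q, size):
    """Coefficients (polynomials in y, s, t) of x**e * (op - var), indexed by x-degree."""
    coeffs = [defaultdict(Fraction) for _ in range(size)]
    for (j, k), c in op.items():
        for (j2, k2), c2 in x_pow_y_pow(e, j, q).items():
            coeffs[k2 + k][(j2, 0, 0)] += c * c2
    coeffs[e][(0, 1, 0) if var == "s" else (0, 0, 1)] -= 1  # var is central
    return coeffs


def pmul(f, g):
    """Product of two polynomials in the commuting variables y, s, t."""
    h = defaultdict(Fraction)
    for (a1, b1, c1), u in f.items():
        for (a2, b2, c2), v in g.items():
            h[(a1 + a2, b1 + b2, c1 + c2)] += u * v
    return h


def det(matrix):
    """Determinant by the Leibniz formula (fine for the small sizes used here)."""
    n = len(matrix)
    total = defaultdict(Fraction)
    for perm in permutations(range(n)):
        inversions = sum(perm[a] > perm[b] for a in range(n) for b in range(a + 1, n))
        term = {(0, 0, 0): Fraction((-1) ** inversions)}
        for r in range(n):
            term = pmul(term, matrix[r][perm[r]])
        for mono, c in term.items():
            total[mono] += c
    return {m: c for m, c in total.items() if c != 0}


def eliminant(P, Q, q):
    n = max(k for (_, k) in P)  # degree of P in x
    m = max(k for (_, k) in Q)  # degree of Q in x
    rows = [row(e, P, "s", q, n + m) for e in range(m)]
    rows += [row(l, Q, "t", q, n + m) for l in range(n)]
    # Column n + m - i holds the coefficient of x**i: reverse the x-degree order.
    return det([r[::-1] for r in rows])
```

With `P = {(1, 1): Fraction(1)}` and `Q = {(2, 2): Fraction(1)}`, the call `eliminant(P, Q, Fraction(3))` yields the monomials of $(3t + s - s^2)y^2$, in line with the example above for $q = 3$.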


5 Generalisation to Ore Extensions 


The eliminant construction can be generalised to any Ore extension in an obvious 
way. This was done by Larsson in [11] and described in more detail in [15]. 

To elaborate slightly, suppose that $P$ and $Q$ are commuting elements of some
Ore extension $R[x; \sigma, \delta]$ with $P$ having degree $n$ and $Q$ having degree $m$. For
$e = 0, 1, \ldots, m - 1$ compute

$$x^e(P - s) = \sum_{i=0}^{n+m-1} p_{i,e}(s)\,x^i,$$

and similarly, for $l = 0, 1, \ldots, n - 1$ compute

$$x^l(Q - t) = \sum_{j=0}^{n+m-1} q_{j,l}(t)\,x^j.$$

Then use the coefficients $p_{i,e}$ and $q_{j,l}$ to form the determinant like before.
The question whether the eliminant still computes an annihilating polynomial is
answered in the following theorem, found in [15].


Theorem 7 If $P$ and $Q$ are commuting elements of $R[x; \sigma, \delta]$ then
$$f(s, t) = \Delta_{P,Q}(s, t)$$
is a polynomial in two commuting variables such that $f(P, Q) = 0$. If $R$ is an integral
domain and $\sigma$ is an injective function then $\Delta_{P,Q}(s, t)$ is a non-zero polynomial.


We will illustrate the eliminant construction in a special class of Ore extensions.
They will be of the form $K[y][x; \sigma, \delta]$ where $K$ is a field, $\sigma$ and $\delta$ are $K$-linear and
$\deg_y(\sigma(y)) > 1$. This is closely similar to the construction of the $q$-Weyl algebra.
Instead of the relation $xy = qyx + 1$, we now have a relation $xy = f(y)x + 1$, where
$f(y)$ is some polynomial of degree larger than 1.
Example 8 Consider the case when $\sigma(y) = y^2$ and $\delta(y) = 1$. Then $P = yx$ and
$Q = y^3x^2$ commute. The eliminant becomes

$$\Delta_{P,Q} = \begin{vmatrix} 0 & y & -s \\ y^2 & 1 - s & 0 \\ y^3 & 0 & -t \end{vmatrix} = (t + s(1 - s))y^3.$$

It is true that $Q + P - P^2 = 0$ so the result is consistent with Theorem 7.


Example 9 Take $R = K[y]$ as before, and set $\sigma(y) = y^2 + 1$ and $\delta(y) = 0$. Then
$P = y^2x - 1$ and $Q = (y^2x)^2$ commute. We find that $Q = y^2(y^2 + 1)^2x^2$ and that
the eliminant is

$$\Delta_{P,Q} = \begin{vmatrix} 0 & y^2 & -1 - s \\ (y^2 + 1)^2 & -1 - s & 0 \\ y^2(y^2 + 1)^2 & 0 & -t \end{vmatrix} = (t - (s + 1)^2)\,y^2(y^2 + 1)^2.$$

We note that in the preceding examples we actually found an annihilating
polynomial over $K$, not just $K[y]$. In [14] one can find it proven that such a polynomial
always exists for commuting elements $P$ and $Q$.
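
The relations in the two preceding examples can be checked with a small model of $K[y][x; \sigma, \delta]$. This is only a sketch under assumptions of our own: polynomials in $y$ are coefficient lists (lowest degree first, the empty list being zero), an element $\sum_m c_m x^m$ is a list of such polynomials, and $\sigma$, $\delta$ are determined by the two polynomials `p` $= \sigma(y)$ and `d` $= \delta(y)$. The class and function names are hypothetical.

```python
from fractions import Fraction


def trim(f):
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f


def padd(f, g):
    n = max(len(f), len(g))
    f = f + [Fraction(0)] * (n - len(f))
    g = g + [Fraction(0)] * (n - len(g))
    return trim(a + b for a, b in zip(f, g))


def pmul(f, g):
    if not f or not g:
        return []
    h = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] += a * b
    return trim(h)


def ore_sub(f, g):
    """Difference f - g of two Ore extension elements."""
    out = []
    for m in range(max(len(f), len(g))):
        a = f[m] if m < len(f) else []
        b = g[m] if m < len(g) else []
        out.append(padd(a, [-c for c in b]))
    while out and not out[-1]:
        out.pop()
    return out


class OreExtension:
    """K[y][x; sigma, delta] with sigma(y) = p and delta(y) = d, both K-linear."""

    def __init__(self, p, d):
        self.p, self.d = trim(p), trim(d)

    def sigma(self, f):
        out = []
        for c in reversed(f):  # Horner evaluation of f at p(y)
            out = padd(pmul(out, self.p), trim([c]))
        return out

    def delta(self, f):
        # Uses delta(y**(n+1)) = sigma(y) * delta(y**n) + delta(y) * y**n.
        out, d_pow, y_pow = [], [], [Fraction(1)]
        for c in f:
            out = padd(out, pmul([c], d_pow))
            d_pow = padd(pmul(self.p, d_pow), pmul(self.d, y_pow))
            y_pow = [Fraction(0)] + y_pow
        return out

    def x_times(self, elem):
        """Left-multiply elem by x, using x c = sigma(c) x + delta(c)."""
        out = [[] for _ in range(len(elem) + 1)]
        for m, c in enumerate(elem):
            out[m + 1] = padd(out[m + 1], self.sigma(c))
            out[m] = padd(out[m], self.delta(c))
        return out

    def mul(self, f, g):
        out = [[] for _ in range(len(f) + len(g))]
        shifted = [list(c) for c in g]  # holds x**i * g, starting with i = 0
        for coeff in f:
            for m, c in enumerate(shifted):
                out[m] = padd(out[m], pmul(coeff, c))
            shifted = self.x_times(shifted)
        while out and not out[-1]:
            out.pop()
        return out
```

For instance, `OreExtension([0, 0, 1], [1])` models $\sigma(y) = y^2$, $\delta(y) = 1$; with `P = [[], [0, 1]]` one finds that `ore_sub(ext.mul(P, P), P)` equals `[[], [], [0, 0, 0, 1]]`, that is $P^2 - P = y^3x^2 = Q$.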


Theorem 10 Let $K$ be a field. Let $\sigma$ be an endomorphism of $K[y]$ such that $\sigma(y) = p(y)$,
where $\deg(p) > 1$, and let $\delta$ be a $\sigma$-derivation. Suppose that $\sigma(a) = a$ and
$\delta(a) = 0$ for all $a \in K$. Let $P$, $Q$ be two commuting elements of $K[y][x; \sigma, \delta]$. Then
there is a nonzero polynomial $f(s, t) \in K[s, t]$ such that $f(P, Q) = 0$.


One might hope that the eliminant construction might allow us to always compute 
an annihilating polynomial in the same way as in Theorem 3. We conjecture that this 
is true but have not been able to prove it. 

We finish with an example where we do not have a theorem like Theorem 3. 


Example 11 Consider the $q$-Weyl algebra with $q = -1$. Then $P = y^2x^2$ and $Q = x^4$
commute. The eliminant becomes

$$\Delta_{P,Q}(s, t) = -(s^2 - ty^4)^2.$$

This is still an annihilating polynomial over $K[y]$ but it does not give us an
annihilating polynomial over $K$, which is expected since no such polynomial exists.
Centralizers and Pseudo-Degree Functions


Johan Richter 


Abstract This paper generalizes a proof of certain results by Hellström and
Silvestrov (J Algebr 314:17-41, 2007, [8]) on centralizers in graded algebras. We 
study centralizers in certain algebras with valuations. We prove that the centralizer 
of an element in these algebras is a free module over a certain ring. Under further 
assumptions we obtain that the centralizer is also commutative. 


Keywords Ore extensions - Algebraic dependence - Commutative subrings 


1 Introduction 


The British mathematicians Burchnall and Chaundy studied, in a series of papers in 
the 1920s and 30s [3-5], the properties of commuting pairs of ordinary differential 
operators. The following theorem is essentially found in their papers. 


Theorem 1 Let $P = \sum_{i=0}^{n} p_i D^i$ and $Q = \sum_{j=0}^{m} q_j D^j$ be two commuting elements
of $T$ with constant leading coefficients. Then there is a non-zero polynomial $f(s, t)$
in two commuting variables over $\mathbb{C}$ such that $f(P, Q) = 0$. Note that the fact that
$P$ and $Q$ commute guarantees that $f(P, Q)$ is well-defined.


The result of Burchnall and Chaundy was rediscovered independently during the
1970s by researchers in the area of PDEs. It turns out that several important equations
can be equivalently formulated as a condition that a pair of differential operators
commute. These differential equations are completely integrable as a result, which
roughly means that they possess an infinite number of conservation laws. In fact
Theorem 1 was rediscovered by Krichever [9] as part of his research into integrable
systems.

To state some generalizations of Burchnall’s and Chaundy’s result we shall recall 
a definition. 
Definition 1 Let $R$ be a ring, $\sigma$ an endomorphism of $R$ and $\delta$ an additive function,
$R \to R$, satisfying
$$\delta(ab) = \sigma(a)\delta(b) + \delta(a)b$$
for all $a, b \in R$. (Such maps $\delta$ are known as $\sigma$-derivations.) The Ore extension $R[x; \sigma, \delta]$
is the polynomial ring $R[x]$ equipped with a new multiplication such that
$xr = \sigma(r)x + \delta(r)$ for all $r \in R$. Every element of $R[x; \sigma, \delta]$ can be written uniquely as
$\sum_i a_i x^i$ for some $a_i \in R$.

If $\sigma = \mathrm{id}$ then $R[x; \mathrm{id}_R, \delta]$ is called a differential operator ring. If $P = \sum_{i=0}^{n} a_i x^i$,
with $a_n \neq 0$, we say that $P$ has degree $n$. The degree of the zero element is defined
to be $-\infty$.


The ring of differential operators studied by Burchnall and Chaundy can be taken
to be the Ore extension $T = C^{\infty}(\mathbb{R}, \mathbb{C})[D; \mathrm{id}, \delta]$, where $\delta$ is the ordinary derivation.
In a paper by Amitsur [1] one can find the following theorem.


Theorem 2 Let $K$ be a field of characteristic zero with a derivation $\delta$. Let $F$ denote
the subfield of constants. (By a constant we mean an element that is mapped to
zero by the derivation.) Form the differential operator ring $S = K[x; \mathrm{id}, \delta]$, and let
$P$ be an element of $S$ of degree $n > 0$. Set $F[P] = \{\sum_{j=0}^{m} b_j P^j \mid b_j \in F\}$, the
ring of polynomials in $P$ with constant coefficients. Then the centralizer of $P$ is a
commutative subring of $S$ and a free $F[P]$-module of rank at most $n$.


Later authors have found other contexts where Amitsur’s method of proof can
be made to work. We mention an article by Goodearl and Carlson [6], and one by
Goodearl alone [7], that generalize Amitsur’s result to a wider class of rings. The
proof has also been generalized by Bavula [2], Mazorchuk [10] and Tang [11],
among other authors. As a corollary of these results, one can recover Theorem 1.

This paper is most directly inspired by a paper by Hellström and Silvestrov [8],
however. Hellström and Silvestrov study graded algebras satisfying a condition they
call $\ell$-BDHC (short for “Bounded-Dimension Homogeneous Centralizers”).


Definition 2 Let $K$ be a field, $\ell$ a positive integer and $S$ a $\mathbb{Z}$-graded $K$-algebra. The
homogeneous components of the gradation are denoted $S_m$, for $m \in \mathbb{Z}$. Let $\mathrm{Cen}(n, a)$,
for $n \in \mathbb{Z}$ and $a \in S$, denote the elements in $S_n$ that commute with $a$. We say that
$S$ has $\ell$-BDHC if for all $n \in \mathbb{Z}$, nonzero $m \in \mathbb{Z}$ and nonzero $a \in S_m$, it holds that
$\dim_K \mathrm{Cen}(n, a) \le \ell$.


Hellström and Silvestrov apply the ideas of Amitsur’s proof. They need to modify
them however, especially to handle the case when $\ell > 1$.

To explain their results further, we introduce some more of their notation. Denote
by $\pi_n$ the projection, defined in the obvious way, from $S$ to $S_n$. Hellström and
Silvestrov define a function $\chi : S \setminus \{0\} \to \mathbb{Z}$ by

$$\chi(a) = \max\{n \in \mathbb{Z} \mid \pi_n(a) \neq 0\},$$

and set $\chi(0) = -\infty$. Set further $\bar\pi(a) = \pi_{\chi(a)}(a)$.
Now we have introduced enough notation to state the relevant results. The
following result is the main part of Lemma 2.5 in their paper.


Theorem 3 Assume $S$ is a $K$-algebra with $\ell$-BDHC and that there are no zero
divisors in $S$. If $a \in S \setminus S_0$ is such that $\chi(a) = m > 0$ and $\bar\pi(a)$ is not invertible in
$S$, then there exists a finite $K[a]$-module basis $\{b_1, \ldots, b_k\}$ for the centralizer of $a$.
Furthermore $k \le m\ell$.


The reason they refer to it as a lemma is that their main interest is in the following 
corollary of this result, (which is proved the same way as Corollary | in this paper). 


Theorem 4 Let $K$ be a field and assume the $K$-algebra $S$ has $\ell$-BDHC and that there
are no zero divisors in $S$. If $a \in S \setminus S_0$ and $b \in S$ are such that $ab = ba$, $\chi(a) > 0$
and $\bar\pi(a)$ is not invertible in $S$, then there exists a nonzero polynomial $P$ in two
commuting variables with coefficients from $K$ such that $P(a, b) = 0$.


Theorem 4 is directly analogous to Theorem 1. 
Hellström and Silvestrov also have a result asserting that certain centralizers are
commutative. Their proof can be made to work in the case when $S$ has 1-BDHC.


Theorem 5 Assume the $K$-algebra $S$ has 1-BDHC and that there are no zero divisors
in $S$. If $a \in S \setminus S_0$ satisfies $\chi(a) = m > 0$ and $\bar\pi(a)$ is not invertible in $S$, then there
exists a finite $K[a]$-module basis $\{b_1, \ldots, b_k\}$ for the centralizer of $a$. The cardinality,
$k$, of the basis divides $m$. Furthermore the centralizer of $a$ is commutative.


It shall be the goal of this paper to generalize the results we have cited from [8]. 


1.1 Notation and Conventions 


$\mathbb{Z}$ will denote the integers.

If $R$ is a ring then $R[x_1, x_2, \ldots, x_n]$ denotes the ring of polynomials over $R$ in
central indeterminates $x_1, x_2, \ldots, x_n$.

All rings and algebras are assumed to be associative and unital.

Let $R$ be a commutative ring and $S$ an $R$-algebra. Two commuting elements,
$p, q \in S$, are said to be algebraically dependent (over $R$) if there is a non-zero
polynomial, $f(s, t) \in R[s, t]$, such that $f(p, q) = 0$, in which case $f$ is called an
annihilating polynomial.

If $S$ is a ring and $a$ is an element in $S$, the centralizer of $a$, denoted $C_S(a)$, is the
set of all elements in $S$ that commute with $a$.

By K we will always denote a field. 
2 Centralizers in Algebras with Degree Functions


Upon reading the proofs in [8] closely it turns out that they are based upon certain
properties of the function $\chi$ they define. We shall axiomatize the properties that are
needed to make their proof work.


Definition 3 Let $K$ be a field and let $S$ be a $K$-algebra. A function, $\chi$, from $S$ to
$\mathbb{Z} \cup \{-\infty\}$ is called a pseudo-degree function if it satisfies the following conditions:

$$
\begin{aligned}
&\chi(a) = -\infty \text{ iff } a = 0, \\
&\chi(ab) = \chi(a) + \chi(b) \text{ for all } a, b \in S, \\
&\chi(a + b) \le \max(\chi(a), \chi(b)), \\
&\chi(a + b) = \chi(a) \text{ if } \chi(b) < \chi(a).
\end{aligned}
$$

This is essentially a special case of the concept of a valuation.
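
The simplest instance to keep in mind is the ordinary degree on a polynomial ring over a domain. As a sanity check (a sketch of our own, not part of the cited work), the following code tests the four conditions on a handful of integer polynomials, reading the last condition as $\chi(a + b) = \chi(a)$ whenever $\chi(b) < \chi(a)$. Polynomials are coefficient lists with the lowest degree first; the names `chi` and `satisfies_axioms` are hypothetical.

```python
from itertools import product

NEG_INF = float("-inf")


def chi(f):
    """Degree of a polynomial given as [c0, c1, ...]; the zero polynomial has degree -inf."""
    nonzero = [i for i, c in enumerate(f) if c != 0]
    return nonzero[-1] if nonzero else NEG_INF


def add(f, g):
    n = max(len(f), len(g))
    return [(f[i] if i < len(f) else 0) + (g[i] if i < len(g) else 0) for i in range(n)]


def mul(f, g):
    if not f or not g:
        return []
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] += a * b
    return h


def satisfies_axioms(samples):
    """Check the four pseudo-degree conditions on all pairs of sample polynomials."""
    for f, g in product(samples, repeat=2):
        if chi(mul(f, g)) != chi(f) + chi(g):
            return False
        if chi(add(f, g)) > max(chi(f), chi(g)):
            return False
        if chi(g) < chi(f) and chi(add(f, g)) != chi(f):
            return False
    # chi(a) = -inf exactly when a is the zero polynomial
    return all((chi(f) == NEG_INF) == all(c == 0 for c in f) for f in samples)
```

A finite check of this kind proves nothing in general, but it makes the conditions concrete: `add([1, -1], [-1, 1])` is zero, so the third condition can hold with strict inequality when the two degrees agree.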
We also need a condition that can replace /-BDHC. We formulate it next. 


Definition 4 Let $K$ be a field and $S$ a $K$-algebra with a pseudo-degree function,
$\chi$, and let $\ell$ be a positive integer. A subalgebra, $B \subseteq S$, is said to satisfy condition
$D(\ell)$ if $\chi(b) \ge 0$ for all non-zero $b \in B$ and if, whenever we have $\ell + 1$ elements
$b_1, \ldots, b_{\ell+1} \in B$, all mapped to the same integer by $\chi$, there exist $\alpha_1, \ldots, \alpha_{\ell+1} \in K$,
not all zero, such that $\chi\bigl(\sum_{i=1}^{\ell+1} \alpha_i b_i\bigr) < \chi(b_1)$.


Remark 1 Note that the requirement that $b_1, \ldots, b_{\ell+1}$ are mapped to the same integer
by $\chi$ excludes the possibility that they are equal to 0.


Remark 2 Suppose that $S$ is a $K$-algebra and $a \in S$ is such that $C_S(a)$ satisfies
condition $D(\ell)$ for some $\ell$. If $b$ is an invertible element then $\chi(b^{-1}) = -\chi(b)$. So
all invertible elements of $C_S(a)$ must be mapped to zero by $\chi$. In particular the
non-zero scalars are all mapped to zero by $\chi$.


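Lemma 1 Suppose that $S$ is a $K$-algebra and $\chi$ is a pseudo-degree function on $S$
that maps all the non-zero scalars to zero. Then if $a, b \in S$ are such that $\chi(b) < \chi(a)$,
the identity

$$\chi(a + b) = \chi(a) \qquad (1)$$

holds.


Proof On the one hand we find $\chi(a + b) \le \max(\chi(a), \chi(b)) = \chi(a)$. On the other
hand $\chi(a) = \chi(a + b - b) \le \max(\chi(a + b), \chi(b))$, where $\chi(-b) = \chi(-1) + \chi(b) = \chi(b)$
because the scalar $-1$ is mapped to zero. Since $\chi(b) < \chi(a)$ we must
have $\chi(a) \le \chi(a + b)$.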


We now proceed to prove an analogue of Theorem 3, using just the existence of
some pseudo-degree function and the condition $D(\ell)$.

Theorem 6 Let $K$ be a field and let $S$ be a $K$-algebra. Suppose $S$ has a
pseudo-degree function, $\chi$.

Let $a$ be an element of $S$, with $m = \chi(a) > 0$, such that $C_S(a)$ satisfies condition
$D(\ell)$ for some positive integer $\ell$. Then $C_S(a)$ is a free $K[a]$-module of rank at most
$\ell m$.


Proof Construct a sequence $b_1, b_2, \ldots$ by setting $b_1 = 1$ and choosing $b_{k+1} \in C_S(a)$ 
such that $\chi(b_{k+1})$ is minimal subject to the restriction that $b_{k+1}$ does not lie in the 
$K[a]$-linear span of $\{b_1, \ldots, b_k\}$. We will show later in the proof that such a sequence 
has at most $\ell m$ elements. 

We first claim that 

$$\chi\left(\sum_{i=1}^{k} \phi_i b_i\right) = \max_{i \le k}\big(\chi(\phi_i) + \chi(b_i)\big), \qquad (2)$$

for any $\phi_1, \ldots, \phi_k \in K[a]$. We show this by induction on $n = \max_{i \le k}(\chi(\phi_i) + \chi(b_i))$. 
It is clear that the left-hand side of (2) is never greater than the right-hand 
side. When $n = -\infty$ Eq. (2) holds since in that case all $\phi_i = 0$. If $n = 0$, Equation 
(2) holds since $\chi(b) \ge 0$ for all non-zero $b \in C_S(a)$. That $\chi(b) \ge 0$ for all non-zero 
$b$ in $C_S(a)$ also means that no value of $n$ between $-\infty$ and 0 is possible. 

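For the induction step, assume (2) holds when the right-hand side is strictly 
less than $n$. To verify that it holds for $n$ as well, we can assume without loss 
of generality that $\chi(\phi_k) + \chi(b_k) = n$, since if $\chi(\phi_j b_j) < n$ for some term $\phi_j b_j$ 
we can drop it without affecting either side of (2), by Lemma 1. If $\phi_k \in K$ 
then $\chi(\phi_k) = 0$, by Remark 2, and thus $\chi(b_k) = n$. By the choice of $b_k$ it then 
follows that $\chi\left(\sum_{i=1}^{k} \phi_i b_i\right) = n$, as otherwise $\sum_{i=1}^{k} \phi_i b_i$ would have been picked 
instead of $b_k$. If $\phi_k \notin K$, then $\chi(b_k) < n$ and thus $\chi(b_i) < n$ for $i = 1, \ldots, k$. Let 
$r_1, \ldots, r_k \in K$ and $\xi_1, \ldots, \xi_k \in K[a]$ be such that $\phi_i = a\xi_i + r_i$ for $i = 1, \ldots, k$. 
We have $\chi\left(\sum_{i=1}^{k} r_i b_i\right) < n$ and thus by Lemma 1 and the assumptions on $\chi$ we get 

$$\chi\left(\sum_{i=1}^{k} \phi_i b_i\right) = \chi\left(a\sum_{i=1}^{k} \xi_i b_i + \sum_{i=1}^{k} r_i b_i\right) = \chi\left(a\sum_{i=1}^{k} \xi_i b_i\right) = m + \chi\left(\sum_{i=1}^{k} \xi_i b_i\right).$$

We also have that $\max_{i \le k}(\chi(\phi_i) + \chi(b_i)) = m + \max_{i \le k}(\chi(\xi_i) + \chi(b_i))$. By the 
induction hypothesis 

$$\chi\left(\sum_{i=1}^{k} \xi_i b_i\right) = \max_{i \le k}\big(\chi(\xi_i) + \chi(b_i)\big),$$

which completes the induction step. 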

We now show that if $\chi(b_i) = \chi(b_j)$ for some $i < j$ then $j - i < \ell$. Suppose 
$b_1, \ldots, b_{\ell+1}$ all are mapped to zero by $\chi$. Then there exist $\alpha_1, \ldots, \alpha_{\ell+1} \in K$, not all 
zero, such that 

$$\chi\left(\sum_{i=1}^{\ell+1} \alpha_i b_i\right) < 0,$$

which is impossible since $\sum_{i=1}^{\ell+1} \alpha_i b_i \in C_S(a)$. 


Suppose now instead that $b_j, \ldots, b_{j+\ell}$ are all mapped to the same positive integer, 
$q$, by $\chi$. Then there exist $\alpha_j, \ldots, \alpha_{j+\ell} \in K$, not all zero, such that 

$$\chi\left(\sum_{i=j}^{j+\ell} \alpha_i b_i\right) < q.$$

But this contradicts (2). 

It remains only to show that the sequence $(b_i)$ contains at most $\ell m$ elements. We will 
prove that every residue class (mod $m$) can contain at most $\ell$ elements. Suppose 
to the contrary, that we had elements $c_1, \ldots, c_{\ell+1}$, belonging to the sequence $(b_i)$ 
and all satisfying $\chi(c_i) \equiv n \pmod{m}$. Set $k = \max_{1 \le i \le \ell+1}(\chi(c_i))$ and define 
$\gamma_i = a^{(k - \chi(c_i))/m}$. Then $\chi(\gamma_i c_i) = k$ for all $i \in \{1, \ldots, \ell + 1\}$, which implies that there 
exist $\alpha_1, \ldots, \alpha_{\ell+1} \in K$, not all zero, such that 

$$\chi\left(\sum_{i=1}^{\ell+1} \alpha_i \gamma_i c_i\right) < k.$$

But this once again contradicts (2). 


We can also prove a result on the algebraic dependence of pairs of commuting 
elements. 


Corollary 1 Let $S$ be a $K$-algebra with a pseudo-degree function, $\chi$. Let $a \in S$ 
be such that $C_S(a)$ satisfies Condition $D(\ell)$ for some $\ell > 0$. Let $b$ be any element in 
$C_S(a)$. Then there exists a nonzero polynomial $P(s, t) \in K[s, t]$ such that $P(a, b) = 0$. 
(Note that $P(a, b)$ is well-defined when $a, b$ commute.) 


Proof Since $C_S(a)$ has finite rank as a $K[a]$-module the elements $b, b^2, \ldots$ cannot all 
be linearly independent over $K[a]$. Thus there exist $f_0(x), \ldots, f_k(x) \in K[x]$, not all 
zero, such that $\sum_{i=0}^{k} f_i(a) b^i = 0$. Then $P(s, t) = \sum_{i=0}^{k} f_i(s) t^i$ is a polynomial 
with the desired property. 


We can also prove a result asserting that certain centralizers are commutative, 
though for that we need to assume that Cs(a) satisfies condition D(1). 


Theorem 7 Let $K$ be a field and suppose $S$ is a $K$-algebra. Let $S$ have a pseudo-degree 
function, $\chi$. If $a \in S$ satisfies $\chi(a) = m > 0$ and $C_S(a)$ satisfies condition 
$D(1)$ then: 

1. $C_S(a)$ has a finite basis as a $K[a]$-module, the cardinality of which divides $m$. 
2. $C_S(a)$ is a commutative algebra. 


Proof By Theorem 6 it is clear that there is a subset $H$ of $\{1, \ldots, m\}$ and elements 
$(b_i)_{i \in H}$ such that the $b_i$ form a basis for $C_S(a)$. By the proof of Theorem 6 it is 
also clear that $\chi(b_i) \ne \chi(b_j)$ if $i \ne j$. Without loss of generality we can assume 
$\chi(b_i) = i$ for all $i \in H$. We can map $H$ into $\mathbb{Z}_m$ in a natural way. Denote the image 
by $G$. We want to show $G$ is a subgroup, for which it is enough to show that it is 
closed under addition. 

Suppose $g, h \in G$. There exist $i, j \in H$, with $i \equiv g \pmod{m}$ and $j \equiv h \pmod{m}$. 
We can write $b_i b_j = \sum_{k \in H} \phi_k b_k$, for some $\phi_k \in K[a]$. It follows that 

$$g + h \equiv i + j = \chi(b_i b_j) = \max_{k \in H}\big(\chi(\phi_k) + \chi(b_k)\big) \equiv \chi(b_k) = k \pmod{m}$$

for some $k \in H$. 

Since $G$ is a subgroup of $\mathbb{Z}_m$, it is clear that the cardinality of $G$, which is also the 
cardinality of $H$, must divide $m$. 

$G$ is cyclic. Let $g$ be a generator of $G$. Consider the algebra generated by $b_i$ and 
$a$, where $i \equiv g \pmod{m}$. It is a commutative algebra and a sub-$K$-vector space of 
$C_S(a)$. Denote it by $E$. If $c$ is any element of $C_S(a)$ we can write $c = e + f$, where 
$e \in E$ and $\chi(f) < mi$, since if $\chi(c) \ge mi$ then there exist $k < m$ and $j \in \mathbb{N}$ such 
that $\chi(a^j b_i^k) = \chi(c)$ and thus there exists $\alpha \in K$ such that $\chi(c - \alpha a^j b_i^k) < \chi(c)$. 

Thus the quotient $C_S(a)/E$ is finite-dimensional. Each $f \in K[a]$ gives rise to 
an endomorphism on $C_S(a)/E$, by the action of multiplication by $f$. Since $K[a]$ is 
infinite-dimensional and the endomorphism ring of $C_S(a)/E$ is finite-dimensional, 
there is some nonzero $\phi \in K[a]$ that induces the zero endomorphism. But this means 
that $\phi c \in E$ for any $c \in C_S(a)$. 

Now let $c_1, c_2$ be two arbitrary elements of $C_S(a)$. Since $E$ is commutative, and 
everything in $C_S(a)$ commutes with $\phi$, it follows that 

$$\phi^2 c_1 c_2 = \phi c_1 \cdot \phi c_2 = \phi c_2 \cdot \phi c_1 = \phi^2 c_2 c_1.$$

Since $C_S(a)$ is a domain it follows that $c_1 c_2 = c_2 c_1$ and thus that $C_S(a)$ is 
commutative. 


3 Examples 


Theorems 3, 4 and 5 follow from our results combined with Lemmas 2.2 and 2.5 in 
[8]. But our results can also be applied in certain situations that are not covered by 
the results in [8]. 


Proposition 1 Let $K$ be a field. Set $R = K[y]$, let $\sigma$ be an endomorphism of $R$ 
such that $s = \deg_y(\sigma(y)) > 1$ and let $\delta$ be a $\sigma$-derivation. Form the Ore extension 
$S = R[x; \sigma, \delta]$. If $a \in S \setminus K$ then $C_S(a)$ is a free $K[a]$-module of finite rank and a 
commutative subalgebra of $S$. 


Proof If $a \in K[y] \setminus K$ then $C_S(a) = K[y]$ and the claim is true. So suppose that 
$a \notin K[y]$. We shall apply Theorem 7. To do so we need a pseudo-degree function. 

The notion of the degree of an element in $S$ with respect to $x$ was defined in the 
introduction of this article. Denote the degree of an element $b$ by $\chi(b)$. It is easy to 
see that $\chi$ satisfies all the requirements to be a pseudo-degree function. We proceed 
to show that $C_S(a)$ satisfies condition $D(1)$. Certainly it is true that $\chi(b) \ge 0$ for all 
nonzero $b \in C_S(a)$. 

Let $b$ be a nonzero element of $S$ that commutes with $a$, such that $\chi(b) = n$. 
Suppose $\chi(a) = m$. By equating the highest order coefficient of $ab$ and $ba$ we find 
that 

$$a_m \sigma^m(b_n) = b_n \sigma^n(a_m), \qquad (3)$$

where $a_m$ and $b_n$ denote the highest order coefficients of $a$ and $b$, respectively. (Recall 
that these are polynomials in $y$.) We equate the degree in $y$ of both sides of (3) and 
find that 

$$\deg_y(a_m) + s^m \deg_y(b_n) = \deg_y(b_n) + s^n \deg_y(a_m),$$

which determines the degree of $b_n$ uniquely. It follows that the solutions of (3) form 
a $K$-subspace of $K[y]$ that is at most one-dimensional. This in turn implies that 
condition $D(1)$ is fulfilled. 
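
To make the uniqueness claim concrete, the degree equation can be rearranged to $\deg_y(b_n)\,(s^m - 1) = \deg_y(a_m)\,(s^n - 1)$, which has at most one solution because $s > 1$ and $m > 0$. The following Python sketch solves it; the function name `leading_coeff_degree` and its parameters are hypothetical and serve only as an illustration.

```python
def leading_coeff_degree(deg_a, m, n, s):
    """Solve deg_a + s**m * d == d + s**n * deg_a for d = deg_y(b_n).

    deg_a: degree in y of the leading coefficient a_m of a
    m, n:  degrees in x of a and b (m > 0, n >= 0)
    s:     degree in y of sigma(y), assumed > 1
    Returns the unique non-negative integer d, or None if none exists.
    """
    if s <= 1 or m <= 0 or n < 0:
        raise ValueError("expected s > 1, m > 0 and n >= 0")
    numerator = deg_a * (s**n - 1)
    denominator = s**m - 1
    if numerator % denominator != 0:
        return None  # no element of degree n in x can commute with a
    return numerator // denominator

d = leading_coeff_degree(deg_a=1, m=1, n=2, s=2)
```

With `deg_a=1, m=1, n=2, s=2` both sides of the degree equation equal 7 when `d` is 3. When the function returns `None`, the degree equation has no integer solution, so (3) admits only the zero solution for that value of $n$.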

We have now verified all the hypotheses necessary to apply Theorem 7. 
Crossed Product Algebras for Piece-Wise 
Constant Functions 


Johan Richter, Sergei Silvestrov, Vincent Ssembatya 
and Alex Behakanira Tumwesigye 


Abstract In this paper we consider algebras of functions that are constant on the sets 
of a partition. We describe the crossed product algebras of the mentioned algebras 
with Z. We show that the function algebra is isomorphic to the algebra of all functions 
on some set. We also describe the commutant of the function algebra and finish by 
giving an example of piece-wise constant functions on a real line. 


1 Introduction 


An important direction of investigation for any class of non-commutative algebras 
and rings, is the description of commutative subalgebras and commutative subrings. 
This is because such a description allows one to relate representation theory, non- 
commutative properties, graded structures, ideals and subalgebras, homological and 
other properties of non-commutative algebras to spectral theory, duality, algebraic 
geometry and topology naturally associated with commutative algebras. In represen- 
tation theory, for example, semi-direct products or crossed products play a central 
role in the construction and classification of representations using the method of 
induced representations. When a non-commutative algebra is given, one looks for 
a subalgebra such that its representations can be studied and classified more easily 
and such that the whole algebra can be decomposed as a crossed product of this 
subalgebra by a suitable action. 

When one has found a way to present a non-commutative algebra as a crossed 
product of a commutative subalgebra by some action on it, then it is important to 
know whether the subalgebra is maximal commutative, or if not, to find a maximal 
commutative subalgebra containing the given subalgebra. This maximality of a com- 
mutative subalgebra and related properties of the action are intimately related to the 
description and classification of representations of the non-commutative algebra. 

Some work has been done in this direction [2, 4, 6] where the interplay between 
topological dynamics of the action on one hand and the algebraic property of the 
commutative subalgebra in the $C^*$-crossed product algebra $C(X) \rtimes \mathbb{Z}$ being maximal 
commutative on the other hand is considered. In [4], an explicit description 
of the (unique) maximal commutative subalgebra containing a subalgebra A of C* 
is given. In [3], properties of commutative subrings and ideals in non-commutative 
algebraic crossed products by arbitrary groups are investigated and a description of 
the commutant of the base coefficient subring in the crossed product ring is given. 
More results on commutants in crossed products and dynamical systems can be found 
in [1, 5] and the references therein. 

In this article, we take a slightly different approach. We consider algebras of func- 
tions that are constant on the sets of a partition, describe the crossed product algebras 
of the mentioned algebras with Z and show that the function algebra is isomorphic 
to the algebra of all functions on some set. We also describe the commutant of the 
function algebra and finish by giving an example of piece-wise constant functions 
on a real line. 


2 Definitions and a Preliminary Result 


Let $A$ be any commutative algebra. Using the notation in [4], we let $\psi : A \to A$ be 
any algebra automorphism on $A$ and define 

$$A \rtimes_\psi \mathbb{Z} := \{f : \mathbb{Z} \to A : f(n) = 0 \text{ except for a finite number of } n\}.$$

It can be proved that $A \rtimes_\psi \mathbb{Z}$ is an associative $\mathbb{C}$-algebra with respect to point-wise 
addition, scalar multiplication and multiplication defined by twisted convolution, $*$, 
as follows: 

$$(f * g)(n) = \sum_{k \in \mathbb{Z}} f(k) \cdot \psi^k(g(n - k)),$$

where $\psi^k$ denotes the $k$-fold composition of $\psi$ with itself for positive $k$ and we use 
the obvious definition for $k \le 0$. 
Definition 1 $A \rtimes_\psi \mathbb{Z}$ as described above is called the crossed product algebra of 
$A$ and $\mathbb{Z}$ under $\psi$. 


A useful and convenient way of working with $A \rtimes_\psi \mathbb{Z}$ is to write elements 
$f, g \in A \rtimes_\psi \mathbb{Z}$ in the form $f = \sum_{n \in \mathbb{Z}} f_n \delta^n$ and $g = \sum_{m \in \mathbb{Z}} g_m \delta^m$, where $f_n = f(n)$, 
$g_m = g(m)$ and 

$$\delta^n(k) = \begin{cases} 1, & \text{if } k = n, \\ 0, & \text{if } k \ne n. \end{cases}$$

Then addition and scalar multiplication are canonically defined and multiplication 
is determined by the relation 

$$(f_n \delta^n) * (g_m \delta^m) = f_n \psi^n(g_m) \delta^{n+m}, \qquad (1)$$

where $m, n \in \mathbb{Z}$ and $f_n, g_m \in A$. 
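
Relation (1) can be sketched in code for a small finite model. The Python block below is only an illustration under stated assumptions: the set is `range(N)`, a function is a tuple of integer values, the automorphism is $f \mapsto f \circ \sigma^{-1}$ for a permutation `sigma`, and an element of the crossed product is a dictionary mapping the exponent $n$ to the coefficient $f_n$. The names `make_sigma_tilde` and `multiply` are hypothetical.

```python
def make_sigma_tilde(sigma):
    """Return the map (f, n) -> f o sigma^{-n} for a permutation of range(N)."""
    n_points = len(sigma)
    inverse = [0] * n_points
    for x, y in enumerate(sigma):
        inverse[y] = x

    def sigma_tilde(f, n=1):
        # One application sends f to f o sigma^{-1}; negative n uses sigma itself.
        perm = inverse if n >= 0 else sigma
        g = tuple(f)
        for _ in range(abs(n)):
            g = tuple(g[perm[x]] for x in range(n_points))
        return g

    return sigma_tilde

def multiply(F, G, sigma_tilde):
    """Product determined by (f_n d^n)(g_m d^m) = f_n * sigma~^n(g_m) d^(n+m)."""
    out = {}
    for n, f_n in F.items():
        for m, g_m in G.items():
            twisted = sigma_tilde(g_m, n)
            term = tuple(a * b for a, b in zip(f_n, twisted))
            prev = out.get(n + m)
            out[n + m] = term if prev is None else tuple(p + t for p, t in zip(prev, term))
    # Drop zero coefficients so that equal elements have equal dictionaries.
    return {k: v for k, v in out.items() if any(v)}

st = make_sigma_tilde([1, 2, 0])   # sigma(x) = x + 1 mod 3
f = {0: (1, 2, 3)}                 # an element of A, placed in degree 0
delta = {1: (1, 1, 1)}             # the generator delta
left = multiply(delta, f, st)      # delta * f
right = multiply(f, delta, st)     # f * delta
```

Here `left` and `right` differ, which shows that this particular `f` does not commute with `delta`. A function that is constant on the whole set would commute with it.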


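Definition 2 By the commutant $A'$ of $A$ in $A \rtimes_\psi \mathbb{Z}$ we mean 

$$A' := \{f \in A \rtimes_\psi \mathbb{Z} : fg = gf \text{ for every } g \in A\}.$$

It has been proven [4] that the commutant $A'$ is commutative and thus is the 
unique maximal commutative subalgebra containing $A$. For any $f, g \in A \rtimes_\psi \mathbb{Z}$, 
that is, $f = \sum_{n \in \mathbb{Z}} f_n \delta^n$ and $g = \sum_{m \in \mathbb{Z}} g_m \delta^m$, we have $fg = gf$ if and only if 

$$\forall r : \sum_{n \in \mathbb{Z}} f_n \psi^n(g_{r-n}) = \sum_{m \in \mathbb{Z}} g_m \psi^m(f_{r-m}).$$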


Now let $X$ be any set and $A$ an algebra of complex valued functions on $X$. Let 
$\sigma : X \to X$ be any bijection such that $A$ is invariant under $\sigma$ and $\sigma^{-1}$, that is, for every 
$h \in A$, $h \circ \sigma \in A$ and $h \circ \sigma^{-1} \in A$. Then $(X, \sigma)$ is a discrete dynamical system and 
$\sigma$ induces an automorphism $\tilde\sigma : A \to A$ defined by 

$$\tilde\sigma(f) = f \circ \sigma^{-1}.$$

Our goal is to describe the commutant of $A$ in the crossed product algebra $A \rtimes_{\tilde\sigma} \mathbb{Z}$ 
for the case where $A$ is the algebra of functions that are constant on the sets of a 
partition. First we have the following results. 


Definition 3 For any nonzero $n \in \mathbb{Z}$, we set 

$$\mathrm{Sep}^n_A(X) = \{x \in X \mid \exists h \in A : h(x) \ne \tilde\sigma^n(h)(x)\}. \qquad (2)$$


The following theorem has been proven in [4]. 


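Theorem 1 The unique maximal commutative subalgebra of $A \rtimes_{\tilde\sigma} \mathbb{Z}$ that contains 
$A$ is precisely the set of elements 

$$A' = \left\{ \sum_{n \in \mathbb{Z}} f_n \delta^n \;\Big|\; \text{for all } n \in \mathbb{Z} : f_n|_{\mathrm{Sep}^n_A(X)} \equiv 0 \right\}. \qquad (3)$$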


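We observe that since $\tilde\sigma(f) = f \circ \sigma^{-1}$, then 

$$\tilde\sigma^2(f) = \tilde\sigma(f \circ \sigma^{-1}) = (f \circ \sigma^{-1}) \circ \sigma^{-1} = f \circ \sigma^{-2},$$

and hence for every $n \in \mathbb{Z}$, $\tilde\sigma^n(f) = f \circ \sigma^{-n}$. Therefore, by taking $X = \mathbb{R}$ and $A$ as 
the algebra of constant functions on $X$ we have: for every $x \in X$ and every $h \in A$, 

$$\tilde\sigma^n(h)(x) = h \circ \sigma^{-n}(x) = h(\sigma^{-n}(x)) = h(x),$$

since $h$ is a constant function. It follows that in this case $\mathrm{Sep}^n_A(X) = \emptyset$. Therefore 
in this case, $A' = A \rtimes_{\tilde\sigma} \mathbb{Z}$. 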


3 Algebra of Piece-Wise Constant Functions 


Let $X$ be any set, $J$ a countable set and $P = \{X_j : j \in J\}$ be a partition of $X$; that 
is, $X = \bigcup_{r \in J} X_r$ where $X_r \ne \emptyset$ and $X_r \cap X_{r'} = \emptyset$ if $r \ne r'$. 
Let $A$ be the algebra of piece-wise constant complex-valued functions on $X$. That 
is, 

$$A = \{h \in \mathbb{C}^X : \text{for every } j \in J : h(X_j) = \{c_j\}\}.$$

Let $\sigma : X \to X$ be a bijection on $X$. The lemma below gives the necessary and 
sufficient conditions for $(X, \sigma)$ to be a dynamical system. 


Lemma 1 The following are equivalent. 


1. The algebra $A$ is invariant under $\sigma$ and $\sigma^{-1}$. 
2. For every $i \in J$ there exists $j \in J$ such that $\sigma(X_i) = X_j$. 


Proof We recall that the algebra $A$ is invariant under $\sigma$ if and only if for every 
$h \in A$, $h \circ \sigma \in A$. 
Obviously, if for every $i \in J$ there exists a unique $j \in J$ such that $\sigma(X_i) = X_j$, 
then 

$$(h \circ \sigma)(X_i) = h(\sigma(X_i)) = h(X_j) = \{c_j\}.$$

Thus $h \circ \sigma \in A$. 
Conversely, suppose $A$ is invariant under $\sigma$ but 2. does not hold. Let $x_1, x_2 \in X_i$ 
and $X_j, X_k \in P$ with $j \ne k$ be such that $\sigma(x_1) \in X_j$ and $\sigma(x_2) \in X_k$. Let $h : X \to \mathbb{C}$ be the 
function defined by 

$$h(x) = \begin{cases} 1 & \text{if } x \in X_j, \\ 0 & \text{otherwise.} \end{cases}$$

Then $h \in A$. But $h \circ \sigma(x_1) = 1$ and $h \circ \sigma(x_2) = 0$. Thus $h \circ \sigma \notin A$, which contradicts 
the assumption. $\square$ 
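
For a finite set, condition 2 of Lemma 1 can be tested directly. The sketch below is an illustration only: the partition is given as a list of blocks, the bijection as a dictionary, and the hypothetical helper `induced_block_map` returns the map $i \mapsto j$ with $\sigma(X_i) = X_j$, or `None` if some block is not mapped onto a block.

```python
def induced_block_map(blocks, sigma):
    """Return tau with sigma(X_i) = X_tau[i], or None if condition 2 fails.

    blocks: list of disjoint collections covering the domain of sigma
    sigma:  dict sending each point x to sigma(x)
    """
    index = {frozenset(b): i for i, b in enumerate(blocks)}
    tau = {}
    for i, block in enumerate(blocks):
        image = frozenset(sigma[x] for x in block)
        if image not in index:
            return None  # sigma(X_i) is not a block of the partition
        tau[i] = index[image]
    return tau

blocks = [{0, 1}, {2, 3}, {4}]
good = {0: 2, 1: 3, 2: 0, 3: 1, 4: 4}   # swaps the first two blocks
bad = {0: 2, 1: 4, 2: 0, 3: 1, 4: 3}    # splits the block {0, 1}
tau_good = induced_block_map(blocks, good)
tau_bad = induced_block_map(blocks, bad)
```

In this example `tau_good` swaps the block indices 0 and 1 and fixes 2, while `tau_bad` is `None` because the image of `{0, 1}` is `{2, 4}`, which is not a block.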


The following lemma asserts that any bijection $\sigma_2 : X \to X$ that preserves the 
structure of a partition essentially produces the same algebra of functions. 


Lemma 2 Let $P_1 = \{X_j : j \in J\}$ and $P_2 = \{Y_j : j \in J\}$ be partitions of the sets 
$X$ and $Y$ respectively, and let 

$$A_X = \{h \in \mathbb{C}^X : \text{for every } j \in J : h(X_j) = \{c_j\}\},$$

and 

$$A_Y = \{h \in \mathbb{C}^Y : \text{for every } j \in J : h(Y_j) = \{d_j\}\}.$$

Then $A_X$ is isomorphic to $A_Y$. 


Proof Choose points $x_i \in X$ and $y_i \in Y$ such that $x_i \in X_i$ if and only if $y_i \in Y_i$ for all 
$i \in J$ and let $\mu : A_X \to A_Y$ be a function defined by 

$$\mu(f)(y) = f(x_i) \text{ if } y \in Y_i, \quad \forall i \in J. \qquad (4)$$

It is enough to prove that $\mu$ is an algebra isomorphism. 


• Let $f, g \in A_X$ and let $\alpha, \beta \in \mathbb{C}$. If $y \in Y$, then $y \in Y_i$ for some $i \in J$, and 
therefore 

$$\begin{aligned} \mu(\alpha f + \beta g)(y) &= (\alpha f + \beta g)(x_i) \\ &= \alpha f(x_i) + \beta g(x_i) \\ &= \alpha \mu(f)(y) + \beta \mu(g)(y) \\ &= [\alpha \mu(f) + \beta \mu(g)](y). \end{aligned}$$

Therefore $\mu$ is linear since $y$ was arbitrary. 
• For every $f, g \in A_X$ and $y \in Y$ ($y \in Y_i$), 

$$\begin{aligned} \mu(fg)(y) &= (fg)(x_i) \\ &= f(x_i) g(x_i) \\ &= \mu(f)(y) \mu(g)(y) \\ &= [\mu(f)\mu(g)](y). \end{aligned}$$

Thus $\mu$ is a multiplicative homomorphism. 
• Now, suppose $f, g \in A_X$ are such that $f \ne g$. Then there exists $i \in J$ such that 
$f(x_i) \ne g(x_i)$, $x_i \in X_i$. Therefore, if $y \in Y_i$, 

$$\mu(f)(y) = f(x_i) \ne g(x_i) = \mu(g)(y).$$

Therefore $\mu$ is injective. 
• Finally, suppose $h \in A_Y$ and let $f \in A_X$ be defined by $f(x) = h(y_i)$ for $x \in X_i$. If $y \in Y$, 
then $y \in Y_i$ for some $i \in J$, and hence 

$$h(y) = h(y_i) = f(x_i) = \mu(f)(y).$$

It follows that $\mu$ is onto and hence an algebra isomorphism. 
$\square$ 


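Theorem 2 Let $P_1 = \{X_j : j \in J\}$ and $P_2 = \{Y_j : j \in J\}$ be partitions of two sets 
$X$ and $Y$ and $A_X$ and $A_Y$ be algebras of functions that are constant on the sets of the 
partitions $P_1$ and $P_2$ respectively. Let $\sigma_1 : X \to X$ and $\sigma_2 : Y \to Y$ be bijections such 
that $A_X$ is invariant under $\sigma_1$ (and $\sigma_1^{-1}$) and $A_Y$ is invariant under $\sigma_2$ (and $\sigma_2^{-1}$) and 
that $\sigma_1(X_i) = X_j$ whenever $\sigma_2(Y_i) = Y_j$ for all $i, j \in J$. Suppose $\tilde\sigma_1 : A_X \to A_X$ is 
the automorphism on $A_X$ induced by $\sigma_1$, and $\tilde\sigma_2 : A_Y \to A_Y$ is the automorphism 
on $A_Y$ induced by $\sigma_2$. Then 

$$\tilde\sigma_2 \circ \mu = \mu \circ \tilde\sigma_1, \qquad (5)$$

where $\mu$ is given by (4). Moreover, for every $n \in \mathbb{Z}$, 

$$\tilde\sigma_2^n \circ \mu = \mu \circ \tilde\sigma_1^n. \qquad (6)$$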
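Proof Let $y \in Y$ be such that $y \in Y_i$ for some $i \in J$. Then for every $f \in A_X$, 

$$\begin{aligned} (\tilde\sigma_2 \circ \mu)(f)(y) &= (\mu f) \circ \sigma_2^{-1}(y) \\ &= (\mu f)(\sigma_2^{-1}(y)) \\ &= f(\sigma_1^{-1}(x_i)) \\ &= (f \circ \sigma_1^{-1})(x_i) \\ &= \mu(f \circ \sigma_1^{-1})(y) \\ &= \mu[\tilde\sigma_1(f)](y) \\ &= [\mu \circ \tilde\sigma_1](f)(y). \end{aligned}$$

Since $y$ is arbitrary, we have 

$$(\tilde\sigma_2 \circ \mu)(f) = (\mu \circ \tilde\sigma_1)(f)$$

for every $f \in A_X$. And since $f$ is arbitrary, 

$$\tilde\sigma_2 \circ \mu = \mu \circ \tilde\sigma_1.$$

Now from (5), we have 

$$\tilde\sigma_2^2 \circ \mu = \tilde\sigma_2 \circ (\tilde\sigma_2 \circ \mu) = \tilde\sigma_2 \circ (\mu \circ \tilde\sigma_1) = (\tilde\sigma_2 \circ \mu) \circ \tilde\sigma_1 = (\mu \circ \tilde\sigma_1) \circ \tilde\sigma_1 = \mu \circ \tilde\sigma_1^2.$$

Therefore the relation (6) holds for $n = 2$. 
Now suppose the relation (6) holds for $k$. Then: 

$$\tilde\sigma_2^{k+1} \circ \mu = \tilde\sigma_2 \circ (\tilde\sigma_2^k \circ \mu) = \tilde\sigma_2 \circ (\mu \circ \tilde\sigma_1^k) = (\tilde\sigma_2 \circ \mu) \circ \tilde\sigma_1^k = (\mu \circ \tilde\sigma_1) \circ \tilde\sigma_1^k = \mu \circ \tilde\sigma_1^{k+1}.$$

Therefore, from the induction principle, 

$$\tilde\sigma_2^n \circ \mu = \mu \circ \tilde\sigma_1^n.$$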
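
The intertwining relation (5) can be checked on a small finite example. The Python sketch below is illustrative only: the sets, partitions and bijections are invented for the example, functions are dictionaries from points to values, and the names `block_of`, `mu` and `sigma_tilde` are hypothetical.

```python
def block_of(blocks):
    """Map each point to the index of the block containing it."""
    return {x: i for i, block in enumerate(blocks) for x in block}

def mu(f, blocks_x, blocks_y):
    """Transport f from X to Y: mu(f)(y) = f(x_i) for y in Y_i, x_i in X_i."""
    which = block_of(blocks_y)
    return {y: f[blocks_x[i][0]] for y, i in which.items()}

def sigma_tilde(f, sigma):
    """The induced automorphism f -> f o sigma^{-1}."""
    return {sigma[x]: value for x, value in f.items()}

# Partitions of X = {0,...,4} and Y = {0,...,3} with the same index set {0, 1, 2}.
blocks_x = [[0, 1], [2], [3, 4]]
blocks_y = [[0], [1, 2], [3]]
# Both bijections swap blocks 0 and 2 and fix block 1.
sigma1 = {0: 3, 1: 4, 2: 2, 3: 0, 4: 1}
sigma2 = {0: 3, 1: 2, 2: 1, 3: 0}

f = {0: 7, 1: 7, 2: 5, 3: 9, 4: 9}   # constant on the blocks of X
lhs = sigma_tilde(mu(f, blocks_x, blocks_y), sigma2)   # (sigma~_2 o mu)(f)
rhs = mu(sigma_tilde(f, sigma1), blocks_x, blocks_y)   # (mu o sigma~_1)(f)
```

Both `lhs` and `rhs` evaluate to the same function on $Y$, in agreement with (5). The representative of each block is taken to be its first listed element, which is harmless because `f` is constant on blocks.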


Remark 1 From Theorem 2 above, we get two nice results. The first is that if $P_1 = P_2$ 
are partitions of $X$ and $\sigma_1, \sigma_2 : X \to X$ are bijections on $X$ which preserve the 
structure of the partition, they will give rise to the same automorphism. That is, 
suppose $P_1 = \{X_j : j \in J\}$ is a partition of $X$ and $\sigma_1, \sigma_2 : X \to X$ are bijections 
on $X$ such that, if $\sigma_1(X_i) = X_j$, then $\sigma_2(X_i) = X_j$, for all $i, j \in J$. Let $\tilde\sigma : A \to A$ 
be the automorphism on $A$ induced by $\sigma$, that is, for every $h \in A$, 

$$\tilde\sigma(h) = h \circ \sigma^{-1}.$$

Then for every $f \in A$, 

$$\tilde\sigma_1(f) = \tilde\sigma_2(f).$$

This is given by the fact that if $P_1 = P_2$, then in (5), we can take $\mu = \mathrm{id}$. 
The second is the following important theorem. 


Theorem 3 Let $P_1 = \{X_j : j \in J\}$ and $P_2 = \{Y_j : j \in J\}$ be partitions of two sets 
$X$ and $Y$ and $A_X$ and $A_Y$ be algebras of functions that are constant on the sets of the 
partitions $P_1$ and $P_2$ respectively. Let $\sigma_1 : X \to X$ and $\sigma_2 : Y \to Y$ be bijections such 
that $A_X$ is invariant under $\sigma_1$ (and $\sigma_1^{-1}$) and $A_Y$ is invariant under $\sigma_2$ (and $\sigma_2^{-1}$) and 
that $\sigma_1(X_i) = X_j$ whenever $\sigma_2(Y_i) = Y_j$ for all $i, j \in J$. Suppose $\tilde\sigma_1 : A_X \to A_X$ is 
the automorphism on $A_X$ induced by $\sigma_1$, and $\tilde\sigma_2 : A_Y \to A_Y$ is the automorphism 
on $A_Y$ induced by $\sigma_2$. Then the crossed product algebras $A_X \rtimes_{\tilde\sigma_1} \mathbb{Z}$ and $A_Y \rtimes_{\tilde\sigma_2} \mathbb{Z}$ are 
isomorphic. 


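Proof We need to construct an isomorphism between the crossed product algebras 
$A_X \rtimes_{\tilde\sigma_1} \mathbb{Z}$ and $A_Y \rtimes_{\tilde\sigma_2} \mathbb{Z}$. Using the notation in [4], we let $f := \sum_{n \in \mathbb{Z}} f_n \delta_1^n$ be an 
element in $A_X \rtimes_{\tilde\sigma_1} \mathbb{Z}$. Let the function $\tilde\mu : A_X \rtimes_{\tilde\sigma_1} \mathbb{Z} \to A_Y \rtimes_{\tilde\sigma_2} \mathbb{Z}$ be defined by 

$$\tilde\mu\left(\sum_{n \in \mathbb{Z}} f_n \delta_1^n\right) = \sum_{n \in \mathbb{Z}} \mu(f_n) \delta_2^n, \qquad (7)$$

where $\mu$ is defined in (4). Then, since $\mu$ is an algebra isomorphism, it is enough 
to prove that $\tilde\mu$ is multiplicative. To this end, we let $f := \sum_{n \in \mathbb{Z}} f_n \delta_1^n$ and 
$g := \sum_{m \in \mathbb{Z}} g_m \delta_1^m$ be arbitrary elements in $A_X \rtimes_{\tilde\sigma_1} \mathbb{Z}$, and we prove that $\tilde\mu$ is multiplicative 
on the generators $f_n \delta_1^n$ and $g_m \delta_1^m$ respectively. Using (1) we have 

$$\begin{aligned} \tilde\mu\big((f_n \delta_1^n) * (g_m \delta_1^m)\big) &= \tilde\mu\big(f_n \tilde\sigma_1^n(g_m) \delta_1^{n+m}\big) \\ &= \mu\big(f_n \tilde\sigma_1^n(g_m)\big) \delta_2^{n+m} \\ &= [\mu(f_n)\, \mu(\tilde\sigma_1^n(g_m))]\, \delta_2^{n+m} \\ &= \mu(f_n)\, \tilde\sigma_2^n(\mu(g_m))\, \delta_2^{n+m} \quad \text{by (6)} \\ &= \tilde\mu(f_n \delta_1^n) * \tilde\mu(g_m \delta_1^m). \end{aligned}$$

Therefore $\tilde\mu$ is multiplicative on the generators $f_n \delta_1^n$ and since $\tilde\mu$ is linear, it is 
multiplicative on the elements $f = \sum_{n \in \mathbb{Z}} f_n \delta_1^n \in A_X \rtimes_{\tilde\sigma_1} \mathbb{Z}$. $\square$ 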


Remark 2 In Lemma 1 we proved the necessary and sufficient condition on a bijection 
$\sigma : X \to X$ such that the algebra $A_X$ is invariant under $\sigma$, that is, for every $i \in J$ 
there exists $j \in J$ such that $\sigma(X_i) = X_j$, where the $X_i$ form a partition for $X$. From 
this, it can be shown that $A_X$ is isomorphic to $\mathbb{C}^J$, where by $\mathbb{C}^J$ we denote the space of 
complex sequences indexed by $J$. This can be done by constructing an isomorphism 
between $A_X$ and $\mathbb{C}^J$ as follows. 

Let $\tau : J \to J$ be a map such that $\tau(i) = j$ is equivalent to $\sigma(X_i) = X_j$ for all 
$i, j \in J$. Then $\tau$ is a bijection that plays the same role as $\sigma_2$ in Lemma 2. Therefore, 
using the same Lemma, we deduce that the algebra $A_X$ is isomorphic to $\mathbb{C}^J$. In Theorem 3, 
we have shown a method of constructing an isomorphism between the crossed 
product algebras $A_X \rtimes_{\tilde\sigma_1} \mathbb{Z}$ and $A_Y \rtimes_{\tilde\sigma_2} \mathbb{Z}$, when $A_X$ and $A_Y$ are isomorphic. It follows 
that the crossed product algebra $A_X \rtimes_{\tilde\sigma} \mathbb{Z}$ is isomorphic to $\mathbb{C}^J \rtimes_{\tilde\tau} \mathbb{Z}$, where $\tilde\tau$ 
follows the same definition as $\tilde\sigma$. 


In the next section we describe the commutant of our algebra $A_X$ in the crossed 
product algebra $A_X \rtimes_{\tilde\sigma} \mathbb{Z}$. 


3.1 Maximal Commutative Subalgebra 


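We take the same partition $P = \bigcup_{j \in J} X_j$ and a bijection $\sigma : X \to X$ such that for all 
$i \in J$, there exists $j \in J$ such that $\sigma(X_i) = X_j$. For $k \in \mathbb{Z}_{>0}$, let 

$$C_k = \{x \in X \mid k \text{ is the smallest positive integer such that } x, \sigma^k(x) \in X_j \text{ for some } j \in J\}. \qquad (8)$$

According to Theorem 1, the unique maximal commutative subalgebra of $A \rtimes_{\tilde\sigma} \mathbb{Z}$ 
that contains $A$ is precisely the set of elements 

$$A' = \left\{ \sum_{n \in \mathbb{Z}} f_n \delta^n \;\Big|\; \text{for all } n \in \mathbb{Z} : f_n|_{\mathrm{Sep}^n_A(X)} \equiv 0 \right\} \qquad (9)$$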
neZ 
where $\mathrm{Sep}^n_A(X)$ is given by (2). We have the following theorem which gives the 
description of $\mathrm{Sep}^n_A(X)$ in this case and is crucial in the description of the maximal 
commutative subalgebra. 


Theorem 4 Let $\sigma : X \to X$ be a bijection on $X$ as given above, $\tilde\sigma : A_X \to A_X$ be 
the automorphism on $A_X$ induced by $\sigma$ and $C_k$ be given by (8). Then for every $n \in \mathbb{Z}$, 

$$\mathrm{Sep}^n_A(X) = \left( \bigcup_{k \nmid n} C_k \right) \cup C_\infty, \qquad (10)$$

where 

$$C_\infty = \{X_j \in P : \sigma^k(X_j) \ne X_j \ \forall k \ge 1\}.$$


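Proof 1. If $n \equiv 0 \pmod{k}$ and $x \in X_j \in C_k$, then we can write $n = mk$ for some 
$m \in \mathbb{Z}$. Then, since $\sigma^k(X_j) = X_j$ it follows that $\sigma^{mk}(X_j) = X_j$ and therefore 
for every $h \in A$, 

$$\tilde\sigma^n(h)(x) = \tilde\sigma^{mk}(h)(x) = (h \circ \sigma^{-mk})(x) = h(\sigma^{-mk}(x)) = h(x),$$

since $x$ and $\sigma^{-mk}(x) \in X_j$ for all $m \in \mathbb{Z}$. 
2. If $n \not\equiv 0 \pmod{k}$, we can write $n = mk + j$ where $m, j \in \mathbb{Z}$ with $1 \le j < k$. 
It follows that for every $x \in X_i \in C_k$, 

$$\begin{aligned} \tilde\sigma^n(h)(x) &= \tilde\sigma^{mk+j}(h)(x) \\ &= (h \circ \sigma^{-(mk+j)})(x) \\ &= h(\sigma^{-(mk+j)}(x)) \\ &= \tilde\sigma^j(h)(x). \end{aligned}$$

But $k$ is the smallest positive integer such that $\sigma^k(X_i) = X_i$. Therefore, since $j < k$, 
there is some $h \in A$ with $\tilde\sigma^j(h)(x) \ne h(x)$. 
Hence, for $\mathrm{Sep}^n_A(X) = \{x \in X \mid \exists h \in A : h(x) \ne \tilde\sigma^n(h)(x)\}$, we have 

$$\mathrm{Sep}^n_A(X) \cap C_k = \begin{cases} \emptyset & \text{if } n \equiv 0 \pmod{k}, \\ C_k & \text{if } n \not\equiv 0 \pmod{k}, \end{cases}$$

and if $x \in C_\infty$, then obviously $x \in \mathrm{Sep}^n_A(X)$ for every $n \ge 1$, or simply 

$$\mathrm{Sep}^n_A(X) = \left( \bigcup_{k \nmid n} C_k \right) \cup C_\infty.$$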
ktn 
From the above theorem, the description of the maximal commutative subalgebra 
in $A \rtimes_{\tilde\sigma} \mathbb{Z}$ can be done as follows. 


Theorem 5 Let $A_X$ be the algebra of piece-wise constant functions $f : X \to \mathbb{C}$, 
$\sigma : X \to X$ any bijection on $X$, $\tilde\sigma : A_X \to A_X$ the automorphism on $A_X$ induced by 
$\sigma$ and $C_k$ be as described above. Then the unique maximal commutative subalgebra 
of $A_X \rtimes_{\tilde\sigma} \mathbb{Z}$ that contains $A_X$ is given by 

$$A' = \left\{ \sum_{n \in \mathbb{Z} : k \mid n} \Big( \sum_{j_n \in J} a_{j_n} \chi_{X_{j_n}} \Big) \delta^n \right\}.$$


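Proof From (9) we have that the unique maximal commutative subalgebra of $A_X \rtimes_{\tilde\sigma} \mathbb{Z}$ 
that contains $A_X$ is precisely the set of elements 

$$A' = \left\{ \sum_{n \in \mathbb{Z}} f_n \delta^n \;\Big|\; \text{for all } n \in \mathbb{Z} : f_n|_{\mathrm{Sep}^n_A(X)} \equiv 0 \right\},$$

and from (2), 

$$\mathrm{Sep}^n_A(X) = \bigcup_{k \nmid n} C_k. \qquad (11)$$

Combining the two results and using the definition of $h_n \in A_X$ as 

$$h_n = \sum_{j_n \in J} a_{j_n} \chi_{X_{j_n}},$$

we get 

$$A' = \left\{ \sum_{n \in \mathbb{Z} : k \mid n} \Big( \sum_{j_n \in J} a_{j_n} \chi_{X_{j_n}} \Big) \delta^n \right\}.$$
$\square$ 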


It can be observed from the results in Theorem 4 that it is possible to have 
$\mathrm{Sep}^n_A(X) = X$ for all nonzero $n \in \mathbb{Z}$. For example, suppose $J$ is infinite and let $\sigma : X \to X$ 
be a bijection such that $\sigma(X_j) = X_{j+1}$ for every $j \in J$. Then it is easily seen that 
in this case $\mathrm{Sep}^n_A(X) = X$. However, this is not possible if $J$ is finite since in this 
case $\sigma$ acts like a permutation on a finite set. In the following section, we treat one 
such case. We let $X = \mathbb{R}$ and $A$ be the algebra of piece-wise constant functions 
on $\mathbb{R}$ with $N$ fixed jump points, where $N \ge 1$ is an integer. In order to work in the 
setting described before, we treat jump points as intervals of zero length. Then $\mathbb{R}$ is 
partitioned into $2N + 1$ sub-intervals. 
4 Algebra of Piece-Wise Constant Functions on the Real 
Line with N Fixed Jump Points 


Let $A$ be the algebra of piece-wise constant functions $f : \mathbb{R} \to \mathbb{R}$ with $N$ fixed jumps 
at points $t_1, t_2, \ldots, t_N$. Partition $\mathbb{R}$ into $N + 1$ intervals $I_0, I_1, \ldots, I_N$ where 
$I_\alpha = \,]t_\alpha, t_{\alpha+1}[$ with $t_0 = -\infty$ and $t_{N+1} = \infty$. By looking at jump points as intervals of 
zero length, we can write $\mathbb{R} = \bigcup I_\alpha$ where $I_\alpha$ is as described above for $\alpha = 0, 1, \ldots, N$ 
and $I_\alpha = \{t_\alpha\}$ if $\alpha > N$. Then for every $h \in A$ we have 

$$h = \sum_{\alpha=0}^{2N} a_\alpha \chi_{I_\alpha}, \qquad (12)$$

where $\chi_{I_\alpha}$ is the characteristic function of $I_\alpha$. As in the preceding section, we let 
$\sigma : \mathbb{R} \to \mathbb{R}$ be any bijection on $\mathbb{R}$ and let $\tilde\sigma : A \to A$ be the automorphism on $A$ 
induced by $\sigma$. Then we have the following lemma which gives the necessary and 
sufficient conditions for $(\mathbb{R}, \sigma)$ to be a discrete dynamical system. 


Lemma 3 The algebra $A$ is invariant under both $\sigma$ and $\sigma^{-1}$ if and only if the 
following conditions hold. 


1. $\sigma$ (and $\sigma^{-1}$) maps each jump point $t_k$, $k = 1, \ldots, N$ onto another jump point. 
2. $\sigma$ maps every interval $I_\alpha$, $\alpha = 0, 1, \ldots, N$ bijectively onto any of the other 
intervals $I_0, I_1, \ldots, I_N$. 


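Proof Obviously, if the two conditions hold, then $A$ is invariant under $\sigma$. So we 
suppose that $A$ is invariant under $\sigma$ and prove that the two conditions must hold. 


1. Suppose $\sigma(t_k) = t_0 \notin \{t_1, t_2, \ldots, t_N\}$ for some $k \in \{1, 2, \ldots, N\}$. Then, since $\sigma$ 
is onto, there exists $x_0 \in \mathbb{R}$ such that $\sigma(x_0) = t_k$, that is, there exists a non jump 
point that is mapped onto a jump point. We show that this is not possible. 

Let 

$$h(x) = \begin{cases} 1 & \text{if } x = t_k, \\ 0 & \text{otherwise.} \end{cases}$$

Then $h \in A$. But 

$$h \circ \sigma(x) = \begin{cases} 1 & \text{if } \sigma(x) = t_k, \\ 0 & \text{otherwise,} \end{cases} \;=\; \begin{cases} 1 & \text{if } x = x_0, \\ 0 & \text{otherwise.} \end{cases}$$

Therefore $h \circ \sigma \notin A$, which is a contradiction, implying that $\sigma$ does not map a 
non jump point onto a jump point, proving the first condition. 
2. Consider the bijection $\sigma : \mathbb{R} \to \mathbb{R}$ defined by 

$$\sigma(x) = \begin{cases} x & \text{if } x \ne x_{k-1} \text{ or } x_k, \\ x_k & \text{if } x = x_{k-1}, \\ x_{k-1} & \text{if } x = x_k, \end{cases} \qquad (13)$$

where $x_{k-1} \in I_{k-1}$ and $x_k \in I_k$ for some $k \in \{1, 2, \ldots, N\}$. Then $\sigma$ is a bijection that 
permutes the jump points. Let $h \in A$. Then using (12) and for the $\sigma$ in Eq. (13) 
above, we have: 

$$h \circ \sigma(x) = \begin{cases} h(x) & \text{if } x \ne x_{k-1} \text{ or } x_k, \\ a_k & \text{if } x = x_{k-1}, \\ a_{k-1} & \text{if } x = x_k. \end{cases}$$

Therefore, $h \circ \sigma$ has jumps at points $t_1, \ldots, t_N, x_{k-1}, x_k$, implying that $h \circ \sigma \notin A$. 
$\square$ 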


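The following theorem gives the description of $\mathrm{Sep}^n_A(\mathbb{R})$ for any $n \in \mathbb{Z}$. 


Theorem 6 Let $A$ be an algebra of piece-wise constant functions with $N$ fixed jumps 
at points $t_1, \ldots, t_N$, $\sigma : \mathbb{R} \to \mathbb{R}$ be any bijection on $\mathbb{R}$ such that $A$ is invariant under 
$\sigma$ and let $\tilde\sigma : A \to A$ be the automorphism on $A$ induced by $\sigma$. Let 

$$C_k = \{x \in \mathbb{R} \mid k \text{ is the smallest positive integer such that } x, \sigma^k(x) \in I_\alpha \text{ for some } \alpha = 0, \ldots, 2N\}. \qquad (14)$$

Then for every $n \in \mathbb{Z}$, 

$$\mathrm{Sep}^n_A(\mathbb{R}) = \bigcup_{k \nmid n} C_k. \qquad (15)$$

Proof See Theorem 4 and observe that $C_\infty = \emptyset$ in this case. $\square$ 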


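Example 1 Let $A$ be the algebra of piece-wise constant functions with 4 fixed jump 
points at $t_1, t_2, t_3, t_4$. Partition $\mathbb{R}$ into five subintervals $I_0, \ldots, I_4$ where $I_\alpha = \,]t_\alpha, t_{\alpha+1}[$ 
with $t_0 = -\infty$ and $t_5 = \infty$. 

Let $\sigma : \mathbb{R} \to \mathbb{R}$ be a bijection such that $\sigma(I_0) = I_1$, $\sigma(I_1) = I_2$, $\sigma(I_2) = I_0$, 
$\sigma(I_3) = I_4$ and $\sigma(I_4) = I_3$. It follows that $\sigma^3(I_0) = I_0$, $\sigma^3(I_1) = I_1$ and $\sigma^3(I_2) = I_2$. 
But $\sigma^j(I_\alpha) \ne I_\alpha$ for $\alpha = 0, 1, 2$ and $1 \le j < 3$. 

Also $\sigma^2(I_3) = I_3$, $\sigma^2(I_4) = I_4$ but $\sigma^j(I_\alpha) \ne I_\alpha$ if $j \not\equiv 0 \pmod{2}$ with $\alpha = 3, 4$. 
Therefore: 

$$\begin{aligned} \mathrm{Sep}^n_A(\mathbb{R}) &= \{x \in \mathbb{R} \mid \exists h \in A : h(x) \ne \tilde\sigma^n(h)(x)\} \\ &= \mathbb{R} \setminus \big(\{I_3 \cup I_4\} \cup \{t_k : \sigma^n(t_k) = t_k,\ k = 1, 2, 3, 4\}\big) && \text{if } n \equiv 0 \pmod{2} \\ &= \{I_0 \cup I_1 \cup I_2\} \cup \{t_k : \sigma^n(t_k) \ne t_k,\ k = 1, 2, 3, 4\} && \text{if } n \equiv 0 \pmod{2}, \end{aligned}$$

and 

$$\begin{aligned} \mathrm{Sep}^n_A(\mathbb{R}) &= \{x \in \mathbb{R} \mid \exists h \in A : h(x) \ne \tilde\sigma^n(h)(x)\} \\ &= \mathbb{R} \setminus \big(\{I_0 \cup I_1 \cup I_2\} \cup \{t_k : \sigma^n(t_k) = t_k,\ k = 1, 2, 3, 4\}\big) && \text{if } n \equiv 0 \pmod{3} \\ &= \{I_3 \cup I_4\} \cup \{t_k : \sigma^n(t_k) \ne t_k,\ k = 1, 2, 3, 4\} && \text{if } n \equiv 0 \pmod{3}. \end{aligned}$$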
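
On the level of the five open intervals, the computation in this example can be reproduced with a short Python sketch. It is an illustration under stated assumptions: only the intervals $I_0, \ldots, I_4$ are modelled (the jump points are left out), the action of $\sigma$ on them is encoded as a dictionary `tau` of indices, and the helper names `orbit_periods` and `sep_blocks` are hypothetical.

```python
def orbit_periods(tau):
    """Map each index i to the least k >= 1 with tau^k(i) = i.

    tau must be a permutation of a finite index set, so every period is finite.
    """
    periods = {}
    for i in tau:
        j, k = tau[i], 1
        while j != i:
            j, k = tau[j], k + 1
        periods[i] = k
    return periods

def sep_blocks(tau, n):
    """Indices of the blocks contained in Sep^n: those whose period does not divide n."""
    return {i for i, k in orbit_periods(tau).items() if n % k != 0}

# sigma(I_0) = I_1, sigma(I_1) = I_2, sigma(I_2) = I_0, sigma(I_3) = I_4, sigma(I_4) = I_3
tau = {0: 1, 1: 2, 2: 0, 3: 4, 4: 3}
periods = orbit_periods(tau)
```

For this `tau`, the intervals $I_0, I_1, I_2$ have period 3 and $I_3, I_4$ have period 2, so `sep_blocks(tau, 2)` contains the indices 0, 1, 2 and `sep_blocks(tau, 3)` contains 3, 4. When $n$ is divisible by 6, no interval is separated.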


From these results we have the following theorem. 


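Theorem 7 Let $A$ be the algebra of piece-wise constant functions $f : \mathbb{R} \to \mathbb{R}$ with 
$N$ fixed jumps at points $t_1, t_2, \ldots, t_N$. Partition $\mathbb{R}$ into $N + 1$ intervals $I_0, I_1, \ldots, I_N$ 
where $I_\alpha = \,]t_\alpha, t_{\alpha+1}[$ with $t_0 = -\infty$ and $t_{N+1} = \infty$ and $I_\alpha = \{t_\alpha\}$ for $N + 1 \le \alpha \le 2N$. 
Let $\sigma : \mathbb{R} \to \mathbb{R}$ be any bijection on $\mathbb{R}$ such that $A$ is invariant under $\sigma$ and let 
$\tilde\sigma : A \to A$ be the automorphism on $A$ induced by $\sigma$. Let 

$$C_k = \{x \in \mathbb{R} \mid k \text{ is the smallest positive integer such that } x, \sigma^k(x) \in I_\alpha \text{ for some } \alpha = 0, \ldots, 2N\}. \qquad (16)$$

Then the unique maximal commutative subalgebra of $A \rtimes_{\tilde\sigma} \mathbb{Z}$ that contains $A$ is 
given by 

$$A' = \left\{ \sum_{n \in \mathbb{Z} : k \mid n} \Big( \sum_{\alpha_n = 0}^{2N} a_{\alpha_n} \chi_{I_{\alpha_n}} \Big) \delta^n \right\}.$$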


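Proof From (9) we have that the unique maximal commutative subalgebra of $A \rtimes_{\tilde\sigma} \mathbb{Z}$ 
that contains $A$ is precisely the set of elements 

$$A' = \left\{ \sum_{n \in \mathbb{Z}} f_n \delta^n \;\Big|\; \text{for all } n \in \mathbb{Z} : f_n|_{\mathrm{Sep}^n_A(\mathbb{R})} \equiv 0 \right\},$$

and from (11), 

$$\mathrm{Sep}^n_A(\mathbb{R}) = \bigcup_{k \nmid n} C_k.$$

Combining the two results and using the definition of $h_n \in A$ as 

$$h_n = \sum_{\alpha_n = 0}^{2N} a_{\alpha_n} \chi_{I_{\alpha_n}},$$

we get 

$$A' = \left\{ \sum_{n \in \mathbb{Z} : k \mid n} \Big( \sum_{\alpha_n = 0}^{2N} a_{\alpha_n} \chi_{I_{\alpha_n}} \Big) \delta^n \right\}.$$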
5 Some Examples 


In this section we give some examples of how our results hold for well known simple 
cases. We treat two cases of piece-wise constant functions on the real line; those with 
one fixed jump point and those with two fixed jump points. 


5.1 Piece-Wise Constant Functions with One Jump Point 


Let $A$ be the collection of all piece-wise constant functions on the real line with one 
fixed jump point $t_0$. Following the methods in the previous section $\mathbb{R}$ is partitioned 
into three intervals $I_0 = (-\infty, t_0)$, $I_1 = (t_0, \infty)$ and $I_2 = \{t_0\}$. Then we can write 
$h \in A$ as 

$$h = \sum_{\alpha=0}^{2} a_\alpha \chi_{I_\alpha} = a_0 \chi_{I_0} + a_1 \chi_{I_1} + a_2 \chi_{I_2}. \qquad (17)$$

Let $\sigma : \mathbb{R} \to \mathbb{R}$ be any bijection on $\mathbb{R}$ and let $\tilde\sigma$ be the automorphism on $A$ induced 
by $\sigma$. Note that by the first part of Lemma 3, invariance of the algebra $A$ implies that 
$\sigma(t_0) = t_0$. It follows therefore that $\sigma(I_0) = I_0$ or $\sigma(I_0) = I_1$. We treat these two 
cases below. 


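5.1.1 $\sigma(I_0) = I_0$ 


In this case (and by bijectivity of $\sigma$), we have that $\sigma(I_1) = I_1$ and since $\sigma(t_0) = t_0$, 
then for every $x \in \mathbb{R}$, $h \in A$ and $n \in \mathbb{Z}$ 

$$\tilde\sigma^n h(x) := h \circ \sigma^{-n}(x) = h(x),$$

since $x$ and $\sigma^{-n}(x)$ will lie in the same interval. Therefore, all intervals $I_\alpha$, 
$\alpha = 0, 1, 2$ belong to $C_1$ and hence 

$$\mathrm{Sep}^n_A(\mathbb{R}) = \bigcup_{k \nmid n} C_k = \emptyset.$$

Therefore, the maximal commutative subalgebra will be given by 

$$A' = \left\{ \sum_{n \in \mathbb{Z}} f_n \delta^n \;\Big|\; \text{for all } n \in \mathbb{Z} : f_n|_{\mathrm{Sep}^n_A(\mathbb{R})} \equiv 0 \right\} = A \rtimes_{\tilde\sigma} \mathbb{Z}.$$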
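5.1.2 $\sigma(I_0) = I_1$ 


In this case (and by bijectivity of $\sigma$), we have that $\sigma(I_1) = I_0$ and since $\sigma(t_0) = t_0$, 
then for every $x \in \mathbb{R}$, $h \in A$ and $n \in \mathbb{Z}$ such that $2 \mid n$ we have 

$$\tilde\sigma^n h(x) := h \circ \sigma^{-n}(x) = h(x),$$

since $x$ and $\sigma^{-n}(x)$ will lie in the same interval. And for odd $n$, $\tilde\sigma^n(h)(x) = h(x)$ 
for all $h \in A$ if and only if $x = t_0$. Therefore, we have 

$$C_1 = \{I_\alpha \mid \sigma(I_\alpha) = I_\alpha\} = I_2,$$

and 

$$C_2 = \{I_\alpha \mid \sigma^2(I_\alpha) = I_\alpha\} = I_0 \cup I_1.$$

Therefore, 

$$\mathrm{Sep}^n_A(\mathbb{R}) = \bigcup_{k \nmid n} C_k = \begin{cases} C_2 & \text{if } 2 \nmid n, \\ \emptyset & \text{if } 2 \mid n. \end{cases}$$

Therefore, the maximal commutative subalgebra will be given by 

$$\begin{aligned} A' &= \left\{ \sum_{n \in \mathbb{Z} : k \mid n} \Big( \sum_{\alpha_n = 0}^{2} a_{\alpha_n} \chi_{I_{\alpha_n}} \Big) \delta^n \right\} \\ &= \left\{ \sum_{m \in \mathbb{Z}} \big( a_{0,m} \chi_{I_0} + a_{1,m} \chi_{I_1} + a_{2,m} \chi_{I_2} \big) \delta^{2m} + \sum_{m \in \mathbb{Z}} \big( a_{2,m} \chi_{I_2} \big) \delta^{2m+1} \right\}. \end{aligned}$$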


5.2 Piece-Wise Constant Functions with Two Jump Points 


Let $A$ be the collection of all piece-wise constant functions on the real line with 
two fixed jump points at $t_0$ and $t_1$. Following the methods in the previous section $\mathbb{R}$ 
is partitioned into intervals $I_0 = \,]-\infty, t_0[$, $I_1 = \,]t_0, t_1[$, $I_2 = \,]t_1, \infty[$, $I_3 = \{t_0\}$ 
and $I_4 = \{t_1\}$. Then we can write $h \in A$ as 

$$h = \sum_{\alpha=0}^{4} a_\alpha \chi_{I_\alpha}. \qquad (18)$$

Let $\sigma : \mathbb{R} \to \mathbb{R}$ be any bijection on $\mathbb{R}$ and let $\tilde\sigma$ be the automorphism on $A$ induced 
by $\sigma$. Note that by the first part of Lemma 3, invariance of the algebra $A$ implies that 
$\sigma(t_0) = t_0$ (and $\sigma(t_1) = t_1$) or $\sigma(t_0) = t_1$ (in which case $\sigma(t_1) = t_0$). Below we give 
a description for the maximal commutative subalgebra of $A \rtimes_{\tilde\sigma} \mathbb{Z}$ for different types 
of $\sigma$. 


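5.2.1 $\sigma(I_\alpha) = I_\alpha$ for all $\alpha = 0, \ldots, 4$ 

This case is similar to the one in Sect. 5.1.1 in the sense that, for every $x \in \mathbb{R}$, $h \in A$ 
and $n \in \mathbb{Z}$ 

$$\tilde\sigma^n h(x) := h \circ \sigma^{-n}(x) = h(x),$$

since $x$ and $\sigma^{-n}(x)$ will lie in the same interval. Therefore, all intervals $I_\alpha$, 
$\alpha = 0, \ldots, 4$ belong to $C_1$ and hence 

$$\mathrm{Sep}^n_A(\mathbb{R}) = \bigcup_{k \nmid n} C_k = \emptyset.$$

Therefore, the maximal commutative subalgebra will be given by 

$$A' = \left\{ \sum_{n \in \mathbb{Z}} f_n \delta^n \;\Big|\; \text{for all } n \in \mathbb{Z} : f_n|_{\mathrm{Sep}^n_A(\mathbb{R})} \equiv 0 \right\} = A \rtimes_{\tilde\sigma} \mathbb{Z}.$$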
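5.2.2 $\sigma(I_0) = I_1$, $\sigma(I_1) = I_0$ and $\sigma(I_\alpha) = I_\alpha$, $\alpha = 2, 3, 4$


In this case (and by bijectivity of $\sigma$), we have that $\sigma(I_1) = I_0$ and therefore for every
$x \in \mathbb{R}$, $h \in A$ and $n \in \mathbb{Z}$ such that $2 \mid n$ we have

$$\tilde\sigma^n h(x) := h \circ \sigma^{-n}(x) = h(x),$$

since $x$ and $\sigma^{-n}(x)$ will lie in the same interval. And for odd $n$, $\tilde\sigma^n(h)(x) = h(x)$ if
and only if $x \in I_2 \cup I_3 \cup I_4$. Therefore, we have

$$C_1 = \{I_\alpha \mid \sigma(I_\alpha) = I_\alpha\} = I_2 \cup I_3 \cup I_4,$$

and

$$C_2 = \{I_\alpha \mid \sigma^2(I_\alpha) = I_\alpha\} = I_0 \cup I_1.$$

Therefore,

$$\operatorname{Sep}^n_A(\mathbb{R}) = \bigcup_{k \nmid n} C_k = \begin{cases} C_2 & \text{if } 2 \nmid n, \\ \emptyset & \text{if } 2 \mid n. \end{cases}$$

It follows that for $n \in \mathbb{Z}$ such that $2 \mid n$, the maximal commutative subalgebra
will be given by

$$A' = \Big\{ \sum_{n \in \mathbb{Z}:\, k \mid n} \Big( \sum_{\alpha=0}^{2N} a_{\alpha,n} \chi_{I_\alpha} \Big) \delta^n \Big\}
= \Big\{ \sum_{n \in \mathbb{Z}:\, 2 \mid n} \Big( \sum_{\alpha=0}^{2N} a_{\alpha,n} \chi_{I_\alpha} \Big) \delta^n \Big\}
= \Big\{ \sum_{m \in \mathbb{Z}} \Big( \sum_{\alpha=0}^{4} a_{\alpha,m} \chi_{I_\alpha} \Big) \delta^{2m} \Big\}.$$

And for odd $n$, we have

$$A' = \Big\{ \sum_{n} \big( a_{2,n}\chi_{I_2} + a_{3,n}\chi_{I_3} + a_{4,n}\chi_{I_4} \big) \delta^n \Big\}.$$

Therefore, the commutant $A'$ is given by:

$$A' = \Big\{ \sum_{m \in \mathbb{Z}} \Big( \sum_{\alpha=0}^{4} a_{\alpha,m} \chi_{I_\alpha} \Big) \delta^{2m} + \sum_{m \in \mathbb{Z}} \big( a_{2,m}\chi_{I_2} + a_{3,m}\chi_{I_3} + a_{4,m}\chi_{I_4} \big) \delta^{2m+1} \Big\}.$$
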

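Similar results can be obtained for the following cases

1. $\sigma(I_0) = I_1$, $\sigma(I_1) = I_0$, $\sigma(I_3) = I_4$, $\sigma(I_4) = I_3$ and $\sigma(I_2) = I_2$.
2. $\sigma(I_0) = I_2$, $\sigma(I_2) = I_0$ and $\sigma(I_\alpha) = I_\alpha$, $\alpha = 1, 3, 4$.
3. $\sigma(I_0) = I_2$, $\sigma(I_2) = I_0$, $\sigma(I_3) = I_4$, $\sigma(I_4) = I_3$ and $\sigma(I_1) = I_1$.
4. $\sigma(I_1) = I_2$, $\sigma(I_2) = I_1$ and $\sigma(I_\alpha) = I_\alpha$, $\alpha = 0, 3, 4$.
5. $\sigma(I_1) = I_2$, $\sigma(I_2) = I_1$, $\sigma(I_3) = I_4$, $\sigma(I_4) = I_3$ and $\sigma(I_0) = I_0$.

Since in all these cases, $\sigma^2(I_\alpha) = I_\alpha$, $\alpha = 0, \ldots, 4$.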


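5.2.3 $\sigma(I_0) = I_1$, $\sigma(I_1) = I_2$, $\sigma(I_2) = I_0$ and $\sigma(I_\alpha) = I_\alpha$, $\alpha = 3, 4$

In this case, using similar methods we have,

$$C_1 = \{I_\alpha \mid \sigma(I_\alpha) = I_\alpha\} = I_3 \cup I_4, \qquad C_2 = \emptyset,$$

and

$$C_3 = \{I_\alpha \mid \sigma^3(I_\alpha) = I_\alpha\} = I_0 \cup I_1 \cup I_2.$$

Therefore,

$$\operatorname{Sep}^n_A(\mathbb{R}) = \bigcup_{k \nmid n} C_k = \begin{cases} C_3 & \text{if } 3 \nmid n, \\ \emptyset & \text{if } 3 \mid n. \end{cases}$$

It follows that for $n \in \mathbb{Z}$ such that $3 \mid n$, the maximal commutative subalgebra
will be given by

$$A' = \Big\{ \sum_{n \in \mathbb{Z}:\, k \mid n} \Big( \sum_{\alpha=0}^{2N} a_{\alpha,n} \chi_{I_\alpha} \Big) \delta^n \Big\}
= \Big\{ \sum_{n \in \mathbb{Z}:\, 3 \mid n} \Big( \sum_{\alpha=0}^{2N} a_{\alpha,n} \chi_{I_\alpha} \Big) \delta^n \Big\}
= \Big\{ \sum_{m \in \mathbb{Z}} \Big( \sum_{\alpha=0}^{4} a_{\alpha,m} \chi_{I_\alpha} \Big) \delta^{3m} \Big\}.$$

If $3 \nmid n$, then

$$A' = \Big\{ \sum_{n} \big( a_{3,n}\chi_{I_3} + a_{4,n}\chi_{I_4} \big) \delta^n \Big\}.$$

Therefore:

$$A' = \Big\{ \sum_{m \in \mathbb{Z}} \Big( \sum_{\alpha=0}^{4} a_{\alpha,m} \chi_{I_\alpha} \Big) \delta^{3m} + \sum_{n \in \mathbb{Z}:\, 3 \nmid n} \big( a_{3,n}\chi_{I_3} + a_{4,n}\chi_{I_4} \big) \delta^{n} \Big\}.$$
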

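5.2.4 $\sigma(I_0) = I_1$, $\sigma(I_1) = I_2$, $\sigma(I_2) = I_0$ and $\sigma(I_3) = I_4$, $\sigma(I_4) = I_3$

In this case, using similar methods we have,

$$C_1 = \emptyset, \qquad C_2 = I_3 \cup I_4,$$

and

$$C_3 = \{I_\alpha \mid \sigma^3(I_\alpha) = I_\alpha\} = I_0 \cup I_1 \cup I_2.$$

Therefore,

$$\operatorname{Sep}^n_A(\mathbb{R}) = \bigcup_{k \nmid n} C_k = \begin{cases}
\mathbb{R} \setminus C_3 & \text{if } 3 \mid n \text{ and } 2 \nmid n, \\
\mathbb{R} \setminus C_2 & \text{if } 2 \mid n \text{ and } 3 \nmid n, \\
\mathbb{R} & \text{if } 2 \nmid n \text{ and } 3 \nmid n, \\
\emptyset & \text{if } 6 \mid n.
\end{cases}$$

It follows that for $n \in \mathbb{Z}$ such that $3 \mid n$, the maximal commutative subalgebra
will be given by

$$A' = \Big\{ \sum_{n \in \mathbb{Z}:\, k \mid n} \Big( \sum_{\alpha=0}^{2N} a_{\alpha,n} \chi_{I_\alpha} \Big) \delta^n \Big\}
= \Big\{ \sum_{n \in \mathbb{Z}:\, 3 \mid n} \Big( \sum_{\alpha=0}^{2N} a_{\alpha,n} \chi_{I_\alpha} \Big) \delta^n \Big\}
= \Big\{ \sum_{m \in \mathbb{Z}} \big( a_{0,m}\chi_{I_0} + a_{1,m}\chi_{I_1} + a_{2,m}\chi_{I_2} \big) \delta^{3m} \Big\}.$$

If $2 \mid n$, then

$$A' = \Big\{ \sum_{m \in \mathbb{Z}} \big( a_{3,m}\chi_{I_3} + a_{4,m}\chi_{I_4} \big) \delta^{2m} \Big\},$$

and for all other values of $n$, $A' = A$. Hence:

$$A' = \Big\{ \sum_{m \in \mathbb{Z}} \big( a_{0,m}\chi_{I_0} + a_{1,m}\chi_{I_1} + a_{2,m}\chi_{I_2} \big) \delta^{3m} + \sum_{m \in \mathbb{Z}} \big( a_{3,m}\chi_{I_3} + a_{4,m}\chi_{I_4} \big) \delta^{2m} \Big\}.$$
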
Commutants in Crossed Product Algebras
for Piece-Wise Constant Functions 


Johan Richter, Sergei Silvestrov and Alex Behakanira Tumwesigye 


Abstract In this paper we consider crossed product algebras of algebras of 
piece-wise constant functions on the real line with Z. For an increasing sequence of 
algebras (in which case the commutants form a decreasing sequence), we describe 
the set difference between the corresponding commutants. 


Keywords Piecewise constant · Crossed products · Commutant


1 Introduction 


An important direction of investigation for any class of non-commutative algebras 
and rings, is the description of commutative subalgebras and commutative subrings. 
This is because such a description allows one to relate representation theory, non- 
commutative properties, graded structures, ideals and subalgebras, homological and 
other properties of non-commutative algebras to spectral theory, duality, algebraic 
geometry and topology naturally associated with commutative algebras. In represen- 
tation theory, for example, semi-direct products or crossed products play a central 
role in the construction and classification of representations using the method of 
induced representations. When a non-commutative algebra is given, one looks for 
a subalgebra such that its representations can be studied and classified more easily 
and such that the whole algebra can be decomposed as a crossed product of this
subalgebra by a suitable action. 

When one has found a way to present a non-commutative algebra as a crossed 
product of a commutative subalgebra by some action on it, then it is important to 
know whether the subalgebra is maximal commutative, or if not, to find a maximal 
commutative subalgebra containing the given subalgebra. This maximality of a 
commutative subalgebra and related properties of the action are intimately related to 
the description and classification of representations of the non-commutative algebra. 

Some work has been done in this direction [2, 4, 6], where the interplay between
the topological dynamics of the action on one hand and the algebraic property of the
commutative subalgebra in the $C^*$-crossed product algebra $C(X) \rtimes \mathbb{Z}$ being maximal
commutative on the other hand is considered. In [4], an explicit description
of the (unique) maximal commutative subalgebra containing a subalgebra $A$ of $\mathbb{C}^X$
is given. In [3], properties of commutative subrings and ideals in non-commutative
algebraic crossed products by arbitrary groups are investigated and a description of 
the commutant of the base coefficient subring in the crossed product ring is given. 
More results on commutants in crossed products and dynamical systems can be found 
in [1, 5] and the references therein. 

In this article, we consider algebras of piece-wise constant functions on the real
line. In [7], a description of the maximal commutative subalgebra of the crossed product
algebra of the said algebra with $\mathbb{Z}$ was given for the case where we have $N$ fixed
jumps. Given the algebras $A_{t_i}$ of piece-wise constant functions with a fixed jump at
$t_i$, we take a sum of $M$ such algebras. This yields an algebra of piece-wise constant
functions with at most $M$ jumps at points $t_1, \ldots, t_M$. Since $A_{\{t_1, \ldots, t_K\}}$, $K = 1, \ldots, M$,
is an increasing sequence of algebras, the commutants $A'_{\{t_1, \ldots, t_K\}}$, $K = 1, \ldots, M$, form
a decreasing sequence of algebras, so we compute the difference between the said
commutants.


2 Definitions and a Preliminary Result 


Let $A$ be any commutative algebra. Using the notation in [4], we let $\tilde\sigma : A \to A$ be
any algebra automorphism on $A$ and define

$$A \rtimes_{\tilde\sigma} \mathbb{Z} := \{ f : \mathbb{Z} \to A \;:\; f(n) = 0 \text{ except for a finite number of } n \}.$$

It can be shown that $A \rtimes_{\tilde\sigma} \mathbb{Z}$ is an associative $\mathbb{C}$-algebra with respect to point-wise
addition, scalar multiplication and multiplication defined by twisted convolution, $*$,
as follows:

$$(f * g)(n) = \sum_{k \in \mathbb{Z}} f(k)\, \tilde\sigma^k\big(g(n-k)\big),$$

where $\tilde\sigma^k$ denotes the $k$-fold composition of $\tilde\sigma$ with itself for positive $k$ and we use
the obvious definition for $k \le 0$.
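
The twisted convolution can be tried out on a toy example. The following is a minimal sketch, not taken from the source: it assumes a finite set X = {0, 1, 2} in place of the real line, represents functions on X as tuples of values, takes the automorphism to be composition with the inverse of a permutation `sigma`, and stores an element of the crossed product as a dictionary from degree n to coefficient. The names `power`, `act` and `multiply` are hypothetical helpers.

```python
def power(perm, k):
    """k-fold composition of a permutation (given as a list); negative k uses the inverse."""
    n = len(perm)
    if k < 0:
        inv = [0] * n
        for x, y in enumerate(perm):
            inv[y] = x
        perm, k = inv, -k
    result = list(range(n))
    for _ in range(k):
        result = [perm[x] for x in result]
    return result


def act(sigma, k, f):
    """Apply the k-th power of the induced automorphism: x -> f(sigma^{-k}(x))."""
    back = power(sigma, -k)
    return tuple(f[back[x]] for x in range(len(f)))


def multiply(sigma, f, g):
    """Twisted convolution: (f*g)(n) = sum over k of f(k) * act(sigma, k, g(n-k))."""
    out = {}
    for k, fk in f.items():
        for m, gm in g.items():
            prod = tuple(a * b for a, b in zip(fk, act(sigma, k, gm)))
            prev = out.get(k + m, (0,) * len(prod))
            out[k + m] = tuple(p + q for p, q in zip(prev, prod))
    return out


# Assumed toy data: sigma swaps the points 0 and 1 and fixes 2.
sigma = [1, 0, 2]
h = {0: (1, 2, 3)}      # an element of A (degree 0)
delta = {1: (1, 1, 1)}  # the generator delta (degree 1)
c = {1: (0, 0, 5)}      # degree-1 element supported on the fixed point only
print(multiply(sigma, delta, h), multiply(sigma, h, delta))
print(multiply(sigma, c, h), multiply(sigma, h, c))
```

In this toy setting `delta` does not commute with `h`, while `c`, whose coefficient vanishes on the two points moved by `sigma`, does.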
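Definition 1 $A \rtimes_{\tilde\sigma} \mathbb{Z}$ as described above is called the crossed product algebra of $A$
and $\mathbb{Z}$ under $\tilde\sigma$.


A useful and convenient way of working with $A \rtimes_{\tilde\sigma} \mathbb{Z}$ is to write elements $f, g \in
A \rtimes_{\tilde\sigma} \mathbb{Z}$ in the form $f = \sum_{n \in \mathbb{Z}} f_n \delta^n$ and $g = \sum_{m \in \mathbb{Z}} g_m \delta^m$, where $f_n = f(n)$, $g_m = g(m)$ and

$$\delta^n(k) = \begin{cases} 1, & \text{if } k = n, \\ 0, & \text{if } k \neq n. \end{cases}$$

In the sum $\sum_{n \in \mathbb{Z}} f_n \delta^n$, we implicitly assume that $f_n = 0$ except for a finite number
of $n$. Addition and scalar multiplication are canonically defined by the usual pointwise
operations and multiplication is determined by the relation

$$(f_n \delta^n) * (g_m \delta^m) = f_n\, \tilde\sigma^n(g_m)\, \delta^{n+m}, \qquad (1)$$

where $m, n \in \mathbb{Z}$ and $f_n, g_m \in A$.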


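Definition 2 By the commutant $A'$ of $A$ in $A \rtimes_{\tilde\sigma} \mathbb{Z}$ we mean

$$A' := \{ f \in A \rtimes_{\tilde\sigma} \mathbb{Z} \;:\; fg = gf \text{ for every } g \in A \}.$$

It has been proven [4] that the commutant $A'$ is commutative and thus is the
unique maximal commutative subalgebra containing $A$. For any $f, g \in A \rtimes_{\tilde\sigma} \mathbb{Z}$,
that is, $f = \sum_{n \in \mathbb{Z}} f_n \delta^n$ and $g = \sum_{m \in \mathbb{Z}} g_m \delta^m$, $fg = gf$ if and only if

$$\forall r \in \mathbb{Z}: \quad \sum_{n \in \mathbb{Z}} f_n\, \tilde\sigma^n(g_{r-n}) = \sum_{m \in \mathbb{Z}} g_m\, \tilde\sigma^m(f_{r-m}).$$
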


Now let $X$ be any set and $A$ an algebra of complex valued functions on $X$.
Let $\sigma : X \to X$ be any bijection such that $A$ is invariant under $\sigma$ and $\sigma^{-1}$, that is,
for every $h \in A$, $h \circ \sigma \in A$ and $h \circ \sigma^{-1} \in A$. Then $(X, \sigma)$ is a discrete dynamical
system and $\sigma$ induces an automorphism $\tilde\sigma : A \to A$ defined by $\tilde\sigma(f) = f \circ \sigma^{-1}$.
In [7], a description of the commutant $A'$ of $A$ in the crossed product algebra $A \rtimes_{\tilde\sigma} \mathbb{Z}$
was given for the case where $A$ is the algebra of functions that are constant on the sets of a
partition. Below are some definitions and results that will be important in
our study. The proofs of the theorems can be found in [7] and the references therein.


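Definition 3 For any nonzero $n \in \mathbb{Z}$, we set

$$\operatorname{Sep}^n_A(X) = \{ x \in X \mid \exists h \in A : h(x) \neq \tilde\sigma^n(h)(x) \}. \qquad (2)$$

The following theorem has been proven in [4].


Theorem 1 The unique maximal commutative subalgebra of $A \rtimes_{\tilde\sigma} \mathbb{Z}$ that contains
$A$ is precisely the set of elements

$$A' = \Big\{ \sum_{n \in \mathbb{Z}} f_n \delta^n \;\Big|\; \text{for all } n \in \mathbb{Z}:\ f_n|_{\operatorname{Sep}^n_A(X)} \equiv 0 \Big\}. \qquad (3)$$

We observe that since $\tilde\sigma(f) = f \circ \sigma^{-1}$, then

$$\tilde\sigma^2(f) = \tilde\sigma(f \circ \sigma^{-1}) = (f \circ \sigma^{-1}) \circ \sigma^{-1} = f \circ \sigma^{-2},$$

and hence $\tilde\sigma^n(f) = f \circ \sigma^{-n}$.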


3 Algebra of Piece-Wise Constant Functions on the Real 
Line with N Fixed Jump Points 


Let $A$ be the algebra of piece-wise constant functions $f : \mathbb{R} \to \mathbb{R}$ with $N$ fixed jumps
at points $t_1, t_2, \ldots, t_N$. Partition $\mathbb{R}$ into $N + 1$ intervals $I_0, I_1, \ldots, I_N$ where $I_\alpha =
(t_\alpha, t_{\alpha+1})$ with $t_0 = -\infty$ and $t_{N+1} = \infty$. By looking at jump points as intervals of zero
length, we can write $\mathbb{R} = \bigcup I_\alpha$, where $I_\alpha$ is as described above for $\alpha = 0, 1, \ldots, N$
and $I_\alpha = \{t_{\alpha - N}\}$ for $N + 1 \le \alpha \le 2N$. Then for every $h \in A$ we have

$$h(x) = \sum_{\alpha=0}^{2N} a_\alpha \chi_{I_\alpha}(x), \qquad (4)$$

where $\chi_{I_\alpha}$ is the characteristic function of $I_\alpha$ and $a_\alpha$ are some constants. As in the
preceding section, we let $\sigma : \mathbb{R} \to \mathbb{R}$ be any bijection on $\mathbb{R}$ such that $A$ is invariant
under $\sigma$ and let $\tilde\sigma : A \to A$ be the automorphism on $A$ induced by $\sigma$. Then we have
the following lemma which gives the necessary and sufficient conditions for $(\mathbb{R}, \sigma)$
to be a discrete dynamical system.


Lemma 1 The algebra $A$ is invariant under both $\sigma$ and $\sigma^{-1}$ if and only if the
following conditions hold.


1. $\sigma$ (and $\sigma^{-1}$) maps each jump point $t_k$, $k = 1, \ldots, N$, onto another jump point.
2. $\sigma$ maps every interval $I_\alpha$, $\alpha = 0, 1, \ldots, N$, bijectively onto any of the other
intervals $I_0, I_1, \ldots, I_N$.


The following theorem gives the description of $\operatorname{Sep}^n_A(\mathbb{R})$ for any $n \in \mathbb{Z}$.


Theorem 2 Let $\sigma : \mathbb{R} \to \mathbb{R}$ be any bijection on $\mathbb{R}$ and let $\tilde\sigma : A \to A$ be the automorphism
on $A$ induced by $\sigma$. Let

$$C_k = \{ x \in \mathbb{R} \mid k \text{ is the smallest positive integer such that } x, \sigma^k(x) \in I_\alpha \text{ for some } \alpha = 0, 1, \ldots, 2N \}. \qquad (5)$$

Then for every $n \in \mathbb{Z}$,

$$\operatorname{Sep}^n_A(\mathbb{R}) = \bigcup_{k \nmid n} C_k. \qquad (6)$$

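
Formula (6) lends itself to a small computation. The sketch below is illustrative only and rests on one assumption drawn from Lemma 1: since $\sigma$ maps each interval $I_\alpha$ onto an interval, its action can be recorded as a permutation `perm` of the indices 0, ..., 2N, and an interval lies in $C_k$ exactly when its index sits on a cycle of length k. The function names `cycle_classes` and `sep` are hypothetical.

```python
def cycle_classes(perm):
    """Group interval indices by the length k of their cycle under perm: {k: set of indices}."""
    classes = {}
    for a in range(len(perm)):
        k, b = 1, perm[a]
        while b != a:
            b = perm[b]
            k += 1
        classes.setdefault(k, set()).add(a)
    return classes


def sep(perm, n):
    """Indices of the intervals forming Sep^n: the union of C_k over all k not dividing n."""
    pieces = [c for k, c in cycle_classes(perm).items() if n % k != 0]
    return set().union(*pieces)


# Assumed example with two jump points (indices 0..4):
# I_0 -> I_1 -> I_2 -> I_0 and I_3 <-> I_4.
perm = [1, 2, 0, 4, 3]
for n in (1, 2, 3, 6):
    print(n, sorted(sep(perm, n)))
```

For this permutation the separation set shrinks as n picks up the divisors 2 and 3 and is empty once 6 divides n.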
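Theorem 3 Let $A$ be the algebra of piece-wise constant functions $f : \mathbb{R} \to \mathbb{R}$
with $N$ fixed jumps at points $t_1, t_2, \ldots, t_N$. We partition $\mathbb{R}$ into $N + 1$ intervals
$I_0, I_1, \ldots, I_N$ where $I_\alpha = (t_\alpha, t_{\alpha+1})$ with $t_0 = -\infty$ and $t_{N+1} = \infty$. Let $\sigma : \mathbb{R} \to \mathbb{R}$
be any bijection on $\mathbb{R}$ such that $A$ is invariant under both $\sigma$ and $\sigma^{-1}$ and let
$\tilde\sigma : A \to A$ be the automorphism on $A$ induced by $\sigma$. Then the unique maximal
commutative subalgebra of $A \rtimes_{\tilde\sigma} \mathbb{Z}$ that contains $A$ is given by

$$A' = \Big\{ \sum_{n \in \mathbb{Z}} f_n \delta^n \;\Big|\; f_n \equiv 0 \text{ on } C_k \text{ if } k \nmid n \Big\},$$

where $C_k$ is as defined in (5).


Proof From Theorem 1, we have that the unique maximal commutative subalgebra
of $A \rtimes_{\tilde\sigma} \mathbb{Z}$ that contains $A$ is precisely the set of elements

$$A' = \Big\{ \sum_{n \in \mathbb{Z}} f_n \delta^n \;\Big|\; \text{for all } n \in \mathbb{Z}:\ f_n|_{\operatorname{Sep}^n_A(X)} \equiv 0 \Big\},$$

and from (6),

$$\operatorname{Sep}^n_A(\mathbb{R}) = \bigcup_{k \nmid n} C_k.$$

Combining the two results, we get

$$A' = \Big\{ \sum_{n \in \mathbb{Z}} f_n \delta^n \;\Big|\; f_n \equiv 0 \text{ on } C_k \text{ if } k \nmid n \Big\}.$$
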


4 Comparison of Commutants 


In this section, we give an explicit description of the set difference between commutants
of an increasing finite sequence of algebras of piece-wise constant functions.
Starting with an algebra of piece-wise constant functions with $N$ fixed jumps at points
$t_1, \ldots, t_N$, we consider additional jump points $t_{N+1}, \ldots, t_{N+M}$ and the
algebras of piece-wise constant functions at these points. In this way, we obtain a
finite increasing sequence of algebras whose commutants (under some conditions)
form a decreasing sequence. We give the description as follows.

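Let $T = \{t_1, \ldots, t_N, t_{N+1}, \ldots, t_{N+M}\}$ for some $N, M \in \mathbb{N}$ such that $t_i < t_j$ if
$i < j$. For $t_\alpha \in T$, let $A_{t_\alpha}$ be the algebra of piece-wise constant functions with a
fixed jump at $t_\alpha$ and for $K \in \{1, \ldots, N + M\}$, let

$$A_{\{t_1, \ldots, t_K\}} = \Big\{ \sum_{\alpha=1}^{K} h_\alpha \;:\; h_\alpha \in A_{t_\alpha} \Big\}.$$

That is,

$$A_{\{t_1, \ldots, t_K\}} = \sum_{\alpha=1}^{K} A_{t_\alpha}.$$

Then $A_{\{t_1, \ldots, t_K\}}$ consists of piece-wise constant functions with at most $K$ jump
points at points $t_1, \ldots, t_K$. It follows immediately that $A_{\{t_1, \ldots, t_J\}} \subseteq A_{\{t_1, \ldots, t_K\}}$ and
therefore the commutants satisfy the relation $A'_{\{t_1, \ldots, t_K\}} \subseteq A'_{\{t_1, \ldots, t_J\}}$ for every $J, K \in
\{1, 2, \ldots, N + M\}$ such that $J < K$. Observe that if $A$ is a subalgebra of an algebra
$B$ of functions and $\sigma$ is a bijection such that both $A$ and $B$ are invariant under $\sigma$, then
$A \rtimes_{\tilde\sigma} \mathbb{Z}$ is a subalgebra of $B \rtimes_{\tilde\sigma} \mathbb{Z}$. In our case, we take $A = A_{\{t_1, \ldots, t_N\}}$, the algebra
of piece-wise constant functions with jumps at $t_1, \ldots, t_N$, and $B = A_{\{t_1, \ldots, t_{N+M}\}}$, the
algebra of piece-wise constant functions with jumps at $t_1, \ldots, t_{N+M}$. In Lemma 2 we
consider the case where these two algebras are both invariant under $\sigma$. In this case
$A_{\{t_1, \ldots, t_N\}} \rtimes_{\tilde\sigma} \mathbb{Z} \subseteq A_{\{t_1, \ldots, t_{N+M}\}} \rtimes_{\tilde\sigma} \mathbb{Z}$ and
we can compare the commutants $A'_{\{t_1, \ldots, t_N\}}$ and $A'_{\{t_1, \ldots, t_{N+M}\}}$ respectively.

For $\alpha \in \{0, 1, \ldots, N\}$, let $I_\alpha = (t_\alpha, t_{\alpha+1})$ with $t_0 = -\infty$ and $t_{N+1} = +\infty$ such
that $I_N = (t_N, \infty)$ and for $i = 0, 1, \ldots, M$, let $I_N^i = (t_{N+i}, t_{N+i+1})$ with $t_{N+M+1} =
\infty$. In order to be in the setting in [7], we define, for $\alpha = 1, \ldots, N$, $I_{N+\alpha} = \{t_\alpha\}$ and
for $i = 1, \ldots, M$, $I_N^{M+i} = \{t_{N+i}\}$. Then we have the following.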


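Lemma 2 Let $\sigma : \mathbb{R} \to \mathbb{R}$ be a bijection such that $A_{\{t_1, \ldots, t_N\}}$ and $A_{\{t_1, \ldots, t_{N+M}\}}$ are both
invariant under $\sigma$. Then $\sigma(I_N) = I_N$.


Proof Suppose $\sigma(I_N) \neq I_N$. Then by Lemma 1, $\sigma(I_N) = I_\alpha$ for some $\alpha \in \{0, 1, \ldots,
N - 1\}$ and since $t_{N+i} \in I_N$ for each $i = 1, \ldots, M$, and since $A_{\{t_1, \ldots, t_{N+M}\}}$ is invariant
under $\sigma$ (and $\sigma(I_N) \neq I_N$), then $\sigma(t_{N+i}) \in \{t_1, \ldots, t_N\}$ for every $i \in \{1, \ldots, M\}$.
But also $\sigma(t_\alpha) \in \{t_1, \ldots, t_N\}$ for all $\alpha = 1, \ldots, N$.

Therefore $\sigma(\{t_1, \ldots, t_N, t_{N+1}, \ldots, t_{N+M}\}) = \{t_1, \ldots, t_N\}$, which contradicts
bijectivity of $\sigma$.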


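Let $\tilde\sigma_1 : A_{\{t_1, \ldots, t_N\}} \to A_{\{t_1, \ldots, t_N\}}$ be the automorphism induced by $\sigma$, that is, $\tilde\sigma_1(f) = f \circ \sigma^{-1}$
for every $f \in A_{\{t_1, \ldots, t_N\}}$, and define $\tilde\sigma_2$ on $A_{\{t_1, \ldots, t_{N+M}\}}$ in the same way. Let

$$\operatorname{Sep}^n_{A_{\{t_1, \ldots, t_N\}}}(\mathbb{R}) := \{ x \in \mathbb{R} \mid \exists h \in A_{\{t_1, \ldots, t_N\}} : h(x) \neq \tilde\sigma_1^n(h)(x) \}$$

and

$$\operatorname{Sep}^n_{A_{\{t_1, \ldots, t_{N+M}\}}}(\mathbb{R}) := \{ x \in \mathbb{R} \mid \exists h \in A_{\{t_1, \ldots, t_{N+M}\}} : h(x) \neq \tilde\sigma_2^n(h)(x) \}.$$

Then it can easily be seen that

$$\operatorname{Sep}^n_{A_{\{t_1, \ldots, t_N\}}}(\mathbb{R}) \subseteq \operatorname{Sep}^n_{A_{\{t_1, \ldots, t_{N+M}\}}}(\mathbb{R}).$$

Using the notation in [7], we let

$$C_k = \{ x \in \mathbb{R} \mid k \text{ is the smallest positive integer such that } x, \sigma^k(x) \in I_\alpha \text{ for some } \alpha = 0, 1, \ldots, 2N \}.$$

Let also

$$C_k^N = \{ x \in I_N \mid k \text{ is the smallest positive integer such that } x, \sigma^k(x) \in I_N^i \text{ for some } i = 1, \ldots, M \}.$$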
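Then we have the following theorem.

Theorem 4

1. For every $n \in \mathbb{Z}$,

$$\operatorname{Sep}^n_{A_{\{t_1, \ldots, t_{N+M}\}}}(\mathbb{R}) = \operatorname{Sep}^n_{A_{\{t_1, \ldots, t_N\}}}(\mathbb{R}) \cup \Big( \bigcup_{k \nmid n} C_k^N \Big),$$

2. and therefore the commutants satisfy:

$$A'_{\{t_1, \ldots, t_{N+M}\}} = A'_{\{t_1, \ldots, t_N\}} \setminus \Big\{ \sum_{n \in \mathbb{Z}} f_n \delta^n \;\Big|\; \text{for some } n \in \mathbb{Z},\ f_n \neq 0 \text{ on some } C_k^N \text{ with } k \nmid n \Big\}.$$
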

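Proof

$$= \operatorname{Sep}^n_{A_{\{t_1, \ldots, t_N\}}}(\mathbb{R}) \cup \Big( \bigcup_{k \nmid n} C_k^N \Big).$$

2. Now let us consider the commutants. By Theorem 1, the commutant $A'_{\{t_1, \ldots, t_N\}}$ is
given by

$$A'_{\{t_1, \ldots, t_N\}} = \Big\{ \sum_{n \in \mathbb{Z}} f_n \delta^n \;\Big|\; \text{for all } n \in \mathbb{Z}:\ f_n|_{\operatorname{Sep}^n_{A_{\{t_1, \ldots, t_N\}}}(\mathbb{R})} \equiv 0 \Big\}
= \Big\{ \sum_{n \in \mathbb{Z}} f_n \delta^n \;\Big|\; f_n(x) \equiv 0 \text{ if } x \in \bigcup_{k \nmid n} C_k \Big\}.$$

Now

$$A'_{\{t_1, \ldots, t_{N+M}\}} = \Big\{ \sum_{n \in \mathbb{Z}} f_n \delta^n \;\Big|\; \text{for all } n \in \mathbb{Z}:\ f_n|_{\operatorname{Sep}^n_{A_{\{t_1, \ldots, t_{N+M}\}}}(\mathbb{R})} \equiv 0 \Big\}
= A'_{\{t_1, \ldots, t_N\}} \setminus \Big\{ \sum_{n \in \mathbb{Z}} f_n \delta^n \;\Big|\; \text{for some } n \in \mathbb{Z},\ f_n \neq 0 \text{ on some } C_k^N \text{ with } k \nmid n \Big\}.$$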


4.1 An Example 


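As an application of our results, we consider the case when only one jump point is
introduced and give an explicit description. Suppose $A_{\{t_1, \ldots, t_N\}}$ and $A_{\{t_1, \ldots, t_{N+1}\}}$ are
algebras defined respectively as follows:

$$A_{\{t_1, \ldots, t_N\}} = \Big\{ \sum_{i=1}^{N} f_i \;\Big|\; f_i \in A_{t_i} \Big\}$$

and

$$A_{\{t_1, \ldots, t_{N+1}\}} = \Big\{ \sum_{i=1}^{N+1} f_i \;\Big|\; f_i \in A_{t_i} \Big\} = A_{\{t_1, \ldots, t_N\}} + A_{t_{N+1}}.$$
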

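Corollary 1 Let $\sigma : \mathbb{R} \to \mathbb{R}$ be a bijection such that $A_{\{t_1, \ldots, t_N\}}$ and $A_{\{t_1, \ldots, t_{N+1}\}}$
are both invariant under $\sigma$. Let $I_N^0 = (t_N, t_{N+1})$ and $I_N^1 = (t_{N+1}, \infty)$. Then, we have
the following.

1. $\operatorname{Sep}^n_{A_{\{t_1, \ldots, t_N\}}}(\mathbb{R}) \subseteq \operatorname{Sep}^n_{A_{\{t_1, \ldots, t_{N+1}\}}}(\mathbb{R})$ for every $n \in \mathbb{Z}$ and moreover

$$\operatorname{Sep}^n_{A_{\{t_1, \ldots, t_{N+1}\}}}(\mathbb{R}) \setminus \operatorname{Sep}^n_{A_{\{t_1, \ldots, t_N\}}}(\mathbb{R}) = \begin{cases}
\emptyset, & \text{if } \sigma(I_N^0) = I_N^0, \\
I_N^0 \cup I_N^1, & \text{if } \sigma(I_N^0) = I_N^1 \text{ and } n \text{ is odd}, \\
\emptyset, & \text{if } \sigma(I_N^0) = I_N^1 \text{ and } n \text{ is even}.
\end{cases}$$

2. If $\sigma(I_N^0) = I_N^1$, then

$$A'_{\{t_1, \ldots, t_{N+1}\}} = A'_{\{t_1, \ldots, t_N\}} \setminus \Big\{ \sum_{m \in \mathbb{Z}} f_m \delta^m \;\Big|\; f_{2m+1} \neq 0 \text{ on } I_N^0 \cup I_N^1 \Big\}.$$
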

meZ 
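Proof

1. Since $A_{\{t_1, \ldots, t_N\}}$ and $A_{\{t_1, \ldots, t_{N+1}\}}$ are both invariant under $\sigma$, then by Lemma
2 we have that $\sigma(I_N) = I_N$ and hence $\sigma(t_{N+1}) = t_{N+1}$, where $I_N = (t_N, \infty)$.
It follows therefore that $I_N \not\subset \operatorname{Sep}^n_{A_{\{t_1, \ldots, t_N\}}}(\mathbb{R})$ for every $n \in \mathbb{Z}$. Observe that if $I_k \subset
\operatorname{Sep}^n_{A_{\{t_1, \ldots, t_N\}}}(\mathbb{R})$ for some $n \in \mathbb{Z}$, $k = 0, 1, \ldots, N - 1$, then $I_k \subset \operatorname{Sep}^n_{A_{\{t_1, \ldots, t_{N+1}\}}}(\mathbb{R})$.
Now consider the action of $\sigma$ on $I_N$ and let $I_N^0 = (t_N, t_{N+1})$ and $I_N^1 = (t_{N+1}, \infty)$.
Then we have the following:

a. If $\sigma(I_N^0) = I_N^0$, then since $\sigma(t_{N+1}) = t_{N+1}$ (and $\sigma$ is a bijection), then $I_N \not\subset
\operatorname{Sep}^n_{A_{\{t_1, \ldots, t_{N+1}\}}}(\mathbb{R})$ for every $n \in \mathbb{Z}$ and hence

$$\operatorname{Sep}^n_{A_{\{t_1, \ldots, t_{N+1}\}}}(\mathbb{R}) = \operatorname{Sep}^n_{A_{\{t_1, \ldots, t_N\}}}(\mathbb{R})$$

for every $n \in \mathbb{Z}$.

b. If $\sigma(I_N^0) = I_N^1$, then $I_N^0, I_N^1 \subset \operatorname{Sep}^1_{A_{\{t_1, \ldots, t_{N+1}\}}}(\mathbb{R})$ and hence

$$I_N^0, I_N^1 \subset \operatorname{Sep}^n_{A_{\{t_1, \ldots, t_{N+1}\}}}(\mathbb{R})$$

for each odd $n \in \mathbb{Z}$. It follows therefore that

$$\operatorname{Sep}^n_{A_{\{t_1, \ldots, t_N\}}}(\mathbb{R}) \subseteq \operatorname{Sep}^n_{A_{\{t_1, \ldots, t_{N+1}\}}}(\mathbb{R})$$

for every $n \in \mathbb{Z}$.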


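2. Using Theorem 1, the commutant $A'_{\{t_1, \ldots, t_{N+1}\}}$ is given by

$$A'_{\{t_1, \ldots, t_{N+1}\}} = \Big\{ \sum_{n \in \mathbb{Z}} f_n \delta^n \;\Big|\; f_n|_{\operatorname{Sep}^n_{A_{\{t_1, \ldots, t_{N+1}\}}}(\mathbb{R})} \equiv 0 \Big\}.$$

We have the following cases:

a. If $\sigma(I_N^0) = I_N^0$, then

$$\operatorname{Sep}^n_{A_{\{t_1, \ldots, t_{N+1}\}}}(\mathbb{R}) = \operatorname{Sep}^n_{A_{\{t_1, \ldots, t_N\}}}(\mathbb{R})$$

for every $n \in \mathbb{Z}$, and hence

$$A'_{\{t_1, \ldots, t_{N+1}\}} = A'_{\{t_1, \ldots, t_N\}}.$$

b. If $\sigma(I_N^0) = I_N^1$, then for odd $n$

$$\operatorname{Sep}^n_{A_{\{t_1, \ldots, t_{N+1}\}}}(\mathbb{R}) = \operatorname{Sep}^n_{A_{\{t_1, \ldots, t_N\}}}(\mathbb{R}) \cup (I_N^0 \cup I_N^1),$$

and hence

$$A'_{\{t_1, \ldots, t_{N+1}\}} = A'_{\{t_1, \ldots, t_N\}} \setminus \Big\{ \sum_{m \in \mathbb{Z}} f_m \delta^m \;\Big|\; \text{for some } m \in \mathbb{Z},\ f_{2m+1} \neq 0 \text{ on } I_N^0 \cup I_N^1 \Big\}.$$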


5 Description of the Center 


Below we give the description of the center of our crossed product algebra $A \rtimes_{\tilde\sigma} \mathbb{Z}$.
This center will be the commutant of some subset of the crossed product, as can be
seen from Remark 1 below. The lemma below will be important in our considerations.


Lemma 3 Let $B \subseteq A$ be a subset of an associative $\mathbb{C}$-algebra $A$ and let $\mathcal{B}$ be the
algebra generated by $B$. Then $B' = \mathcal{B}'$, where $B'$ and $\mathcal{B}'$ denote the commutants of
$B$ and $\mathcal{B}$ respectively.

The following theorem, whose proof can be found in [4], gives the description of
the center of a crossed product algebra $A \rtimes_{\tilde\sigma} \mathbb{Z}$ where $A \subseteq \mathbb{C}^X$.

Theorem 5 Let $A \subseteq \mathbb{C}^X$ be an algebra of functions that is invariant under a bijection
$\sigma : X \to X$. An element $g = \sum_{m \in \mathbb{Z}} g_m \delta^m$ is in $Z(A \rtimes_{\tilde\sigma} \mathbb{Z})$ if and only if both of the
following conditions are satisfied:

1. for all $m \in \mathbb{Z}$, $g_m$ is $\mathbb{Z}$-invariant, and
2. for all $m \in \mathbb{Z}$, $g_m|_{\operatorname{Sep}^m_A(X)} \equiv 0$.

In the theorem below we give the description of the center $Z(A \rtimes_{\tilde\sigma} \mathbb{Z})$ for the
case when $A$ is the algebra of piece-wise constant functions with $N$ jumps. First we
make a few observations.
Recall the definition of $C_k$ as given by (5). By Lemma 1 and bijectivity of $\sigma$, if
$I_\alpha \subset C_k$, then $\sigma^k(I_\alpha) = I_\alpha$. Therefore each $C_k$ consists of cycles of intervals which
we denote by $O_k^i$, and each $O_k^i$ can be written as

$$O_k^i = (I_{i_1}, I_{i_2}, \ldots, I_{i_k})$$

such that $\sigma(I_{i_j}) = I_{i_{j+1}}$ for $j = 1, \ldots, k - 1$ and $\sigma(I_{i_k}) = I_{i_1}$. Using these cycles
and Theorem 5 we give a description of the center below.


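Theorem 6 Let $A$ be the algebra of piece-wise constant functions $f : \mathbb{R} \to \mathbb{R}$ with
$N$ fixed jumps at points $t_1, t_2, \ldots, t_N$ as described in Section 3. Let $\sigma : \mathbb{R} \to \mathbb{R}$ be
any bijection on $\mathbb{R}$ such that $A$ is invariant under $\sigma$ and $\sigma^{-1}$ and let $\tilde\sigma : A \to A$ be
the automorphism on $A$ induced by $\sigma$. Then

$$Z(A \rtimes_{\tilde\sigma} \mathbb{Z}) = \Big\{ \sum_{n \in \mathbb{Z}} f_n \delta^n \;\Big|\; f_n \text{ is constant on every cycle } O_k^i \text{ in } C_k, \text{ for all } k \text{ such that } k \mid n \Big\}.$$

Proof By the second part of Theorem 5, an element $f = \sum_{n \in \mathbb{Z}} f_n \delta^n \in Z(A \rtimes_{\tilde\sigma} \mathbb{Z})$
only if for all $n \in \mathbb{Z}$, $f_n|_{\operatorname{Sep}^n_A(X)} \equiv 0$. From (6), we have that $\operatorname{Sep}^n_A(\mathbb{R}) = \bigcup_{k \nmid n} C_k$.
Therefore, $f \in Z(A \rtimes_{\tilde\sigma} \mathbb{Z})$ only if $f_n \equiv 0$ on every $I_\alpha$ with $\sigma^n(I_\alpha) \neq I_\alpha$. Or equivalently, $f \in
Z(A \rtimes_{\tilde\sigma} \mathbb{Z})$ only if $f_n \equiv 0$ on $C_k$ for all $k \nmid n$.

Also by the first part of Theorem 5, assuming the condition above holds, then $f \in
Z(A \rtimes_{\tilde\sigma} \mathbb{Z})$ if and only if for all $n \in \mathbb{Z}$, $f_n$ is $\mathbb{Z}$-invariant, that is, for all $n \in \mathbb{Z}$
and all $x \in \mathbb{R}$, $f_n(\sigma(x)) = f_n(x)$. From above, $f_n \equiv 0$ on each $C_k$ such that $k \nmid n$.
Now consider $C_k$ such that $k \mid n$. As observed above, such $C_k$ consists of cycles of
intervals of length $k$ denoted by $O_k^i$, where each $O_k^i$ can be written as

$$O_k^i = (I_{i_1}, I_{i_2}, \ldots, I_{i_k})$$

such that $\sigma(I_{i_j}) = I_{i_{j+1}}$ for $j = 1, \ldots, k - 1$ and $\sigma(I_{i_k}) = I_{i_1}$. Since for each $n$, $f_n$
is a piece-wise constant function and $f_n \equiv 0$ on each $C_k$ for which $k \nmid n$, then $f_n$
being $\mathbb{Z}$-invariant is equivalent to saying $f_n$ takes a constant value on each of the
cycles $O_k^i$ in $C_k$.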
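
The two conditions used in the proof can be checked mechanically for a single coefficient $f_n$. The sketch below is a hedged illustration, not the authors' procedure: it assumes the action of $\sigma$ on the intervals is given as a permutation `perm` of their indices, and that `values[a]` is the constant value of $f_n$ on the interval with index `a`. The names `cycles` and `is_central_coefficient` are hypothetical.

```python
def cycles(perm):
    """Decompose a permutation of interval indices into its cycles."""
    seen, out = set(), []
    for a in range(len(perm)):
        if a not in seen:
            cyc, b = [a], perm[a]
            while b != a:
                cyc.append(b)
                b = perm[b]
            seen.update(cyc)
            out.append(cyc)
    return out


def is_central_coefficient(perm, n, values):
    """Check: f_n vanishes on cycles whose length does not divide n,
    and is constant on every cycle whose length divides n."""
    for cyc in cycles(perm):
        vals = {values[a] for a in cyc}
        if n % len(cyc) != 0:
            if vals != {0}:
                return False
        elif len(vals) != 1:
            return False
    return True


# Assumed example: two jump points, I_0 <-> I_1, the other three intervals fixed.
perm = [1, 0, 2, 3, 4]
print(is_central_coefficient(perm, 2, (7, 7, 1, 2, 3)))
print(is_central_coefficient(perm, 1, (7, 7, 1, 2, 3)))
```

With this permutation a degree-2 coefficient may take any common value on the swapped pair, whereas a degree-1 coefficient has to vanish there.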


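Remark 1 One question of interest would be to compare the set difference of the
commutants $A'_{\{t_1, \ldots, t_N\}} \setminus A'_{\{t_1, \ldots, t_{N+M}\}}$ with the commutant $(A_{\{t_1, \ldots, t_{N+M}\}} \setminus A_{\{t_1, \ldots, t_N\}})'$ in
the crossed product algebra $A_{\{t_1, \ldots, t_{N+M}\}} \rtimes_{\tilde\sigma} \mathbb{Z}$.

Observe that $A_{\{t_1, \ldots, t_{N+M}\}} \setminus A_{\{t_1, \ldots, t_N\}}$ is not a subalgebra of $A_{\{t_1, \ldots, t_{N+M}\}}$, but still
its commutant would be a subalgebra of the crossed product algebra $A_{\{t_1, \ldots, t_{N+M}\}}
\rtimes_{\tilde\sigma} \mathbb{Z}$. By Lemma 3, if $B$ is any subset of an associative $\mathbb{C}$-algebra $A$ and $\mathcal{B}$ is
the algebra generated by $B$, then $B' = \mathcal{B}'$. Therefore if we let $A = A_{\{t_1, \ldots, t_{N+M}\}} \rtimes_{\tilde\sigma} \mathbb{Z}$
and $B = (A_{\{t_1, \ldots, t_{N+M}\}} \setminus A_{\{t_1, \ldots, t_N\}}) \rtimes_{\tilde\sigma} \mathbb{Z}$, then the commutant $B' = \mathcal{B}'$, where $\mathcal{B}$ is
the algebra generated by $B$. Now, it is easily seen that if $C = A_{\{t_1, \ldots, t_{N+M}\}} \setminus A_{\{t_1, \ldots, t_N\}}$,
then the algebra $\mathcal{C}$ generated by $C$ is $A_{\{t_1, \ldots, t_{N+M}\}}$. It follows therefore that the algebra
generated by $B$ is the whole crossed product algebra $A_{\{t_1, \ldots, t_{N+M}\}} \rtimes_{\tilde\sigma} \mathbb{Z}$. Therefore
finding the commutant $B'$ of $B$ is equivalent to finding the center $Z(A_{\{t_1, \ldots, t_{N+M}\}} \rtimes_{\tilde\sigma} \mathbb{Z})$.
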
6 Jump Points Added into Different Intervals 

Let $\{t_1, \ldots, t_N\}$ be a set of points in $\mathbb{R}$ such that $t_1 < t_2 < \ldots < t_N$ and let $A$ be an
algebra of piece-wise constant functions with $N$ fixed jumps at points $t_1, \ldots, t_N$. Let
$S = \{s_1, \ldots, s_m\}$ be a set in $\mathbb{R}$, $m < N$, such that $t_{j-1} < s_j < t_j$, $j = 1, \ldots, m$. Let
$A_{s_j}$, $j = 1, \ldots, m$, be the algebra of piece-wise constant functions with a fixed jump at $s_j$ and
define $A_S$ by

$$A_S = A + \sum_{j=1}^{m} A_{s_j}.$$

Then $A_S$ is the algebra of piece-wise constant functions with at most $N + m$ jumps
at points $t_1, \ldots, t_N, s_1, \ldots, s_m$. It can be seen that $A \subseteq A_S$ and therefore
$A'_S \subseteq A'$ (under some conditions), where $A'$ and $A'_S$ denote the commutant for $A$
and $A_S$ respectively. In this section we describe the set of separation points for $A_S$
and then compare the commutants $A'$ and $A'_S$.


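Using the methods in [7], let $I_\alpha = (t_\alpha, t_{\alpha+1})$ for $\alpha = 0, 1, \ldots, N$ with $t_0 = -\infty$
and $t_{N+1} = \infty$ and $I_{N+\alpha} = \{t_\alpha\}$, $\alpha = 1, \ldots, N$. Now, for functions in $A_S$, a
jump point is introduced in each of the intervals $I_\alpha$, $\alpha = 0, \ldots, m$, therefore
each of these intervals is divided into three subintervals $I'_\alpha = (t_\alpha, s_{\alpha+1})$, $I''_\alpha =
(s_{\alpha+1}, t_{\alpha+1})$, and $I'''_\alpha = \{s_{\alpha+1}\}$. We have the following.


Lemma 4 Let $\sigma : \mathbb{R} \to \mathbb{R}$ be a bijection such that the algebras $A$ and $A_S$ are both
invariant under $\sigma$. Then

$$\sigma\Big( \bigcup_{\alpha=0}^{m} I_\alpha \Big) = \bigcup_{\alpha=0}^{m} I_\alpha.$$

Proof Suppose for some $\alpha \in \{0, 1, \ldots, m\}$, $\sigma(I_\alpha) = I_\beta$ for some $\beta > m$. Since
$s_{\alpha+1} \in I_\alpha$, then $\sigma(s_{\alpha+1}) \in I_\beta$. By invariance of $A_S$ under $\sigma$, $\sigma(s_{\alpha+1})$ must be a jump
point for some function $f \in A_S$, which is a contradiction. $\square$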


6.1 Description of $\operatorname{Sep}^n_{A_S}(\mathbb{R})$ and the Commutant $A'_S$


Below we give a description of $\operatorname{Sep}^n_{A_S}(\mathbb{R})$ in terms of $\operatorname{Sep}^n_A(\mathbb{R})$ for $n \in \mathbb{Z}$. As before,
we let $C_k$ be as defined in (5), that is

$$C_k = \{ x \in \mathbb{R} \mid k \text{ is the smallest positive integer such that } x, \sigma^k(x) \in I_\alpha \text{ for some } \alpha = 0, 1, \ldots, 2N \}.$$

Let also

$$C'_k = \{ x \in C_k \mid k \text{ is the smallest positive integer such that } x, \sigma^k(x) \in I_\alpha \text{ for some } \alpha = 0, 1, \ldots, 2N \},$$

and

$$C''_k = \{ x \in C_k \mid 2k \text{ is the smallest positive integer such that } x, \sigma^{2k}(x) \in I_\alpha \text{ for some } \alpha = 0, 1, \ldots, 2N \}.$$

We give the descriptions in the following theorem. 


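Theorem 7 For every $n \in \mathbb{Z}$

$$\operatorname{Sep}^n_{A_S}(\mathbb{R}) = \bigcup_{k \nmid n} \big( C'_k \cup C''_{k/2} \big) \qquad (7)$$

and the commutant is given by

$$A'_S = A' \setminus \Big\{ \sum_{n \in \mathbb{Z}} f_n \delta^n \;\Big|\; \text{for some } n, k \text{ such that } k \text{ is even, } k \nmid n \text{ and } k \mid 2n,\ f_n \neq 0 \text{ on } C''_{k/2} \Big\}. \qquad (8)$$

Proof Using (6), we have that

$$\operatorname{Sep}^n_A(\mathbb{R}) = \bigcup_{k \nmid n} C_k.$$

From the definitions of $C'_k$ and $C''_k$ it follows immediately that

$$C_k = C'_k \cup C''_k,$$

and this proves (7). Now consider the commutant $A'_S$.
Again, from Theorem 1, we have

$$A'_S = \Big\{ \sum_{n \in \mathbb{Z}} f_n \delta^n \;\Big|\; \text{for all } n \in \mathbb{Z}:\ f_n|_{\operatorname{Sep}^n_{A_S}(\mathbb{R})} \equiv 0 \Big\}. \qquad (9)$$

Looking at (7), we observe that if $k$ is odd, then $C''_{k/2} = \emptyset$ and nothing changes
on the commutant. Therefore taking even $k$ and combining (7) and (9), we get

$$A'_S = A' \setminus \Big\{ \sum_{n \in \mathbb{Z}} f_n \delta^n \;\Big|\; \text{for some } n, k \text{ such that } k \text{ is even, } k \nmid n \text{ and } k \mid 2n,\ f_n \neq 0 \text{ on } C''_{k/2} \Big\}.$$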
Asymptotic Expansions for Moment
Functionals of Perturbed Discrete Time 
Semi-Markov Processes 


Mikael Petersson 


Abstract In this paper we study moment functionals of mixed power-exponential 
type for non-linearly perturbed semi-Markov processes in discrete time. Conditions 
under which the moment functionals of interest can be expanded in asymptotic power 
series with respect to the perturbation parameter are given. We show how the coef- 
ficients in these expansions can be computed from explicit recursive formulas. In 
particular, the results of the present paper have applications for studies of quasi- 
stationary distributions. 


Keywords Semi-Markov process - Perturbation - Asymptotic expansion - Renewal 
equation - Solidarity property - First hitting time 


1 Introduction 


The aim of this paper is to present asymptotic power series expansions for some 
important moment functionals of non-linearly perturbed semi-Markov processes in 
discrete time and to show how the coefficients in these expansions can be calculated 
from explicit recursive formulas. These asymptotic expansions play a fundamental 
role for the main result in [6], which is a sequel of the present paper. 

For each $\varepsilon \ge 0$, we let $\xi^{(\varepsilon)}(n)$, $n = 0, 1, \ldots$, be a discrete time semi-Markov
process on the state space $X = \{0, 1, \ldots, N\}$. It is assumed that the process $\xi^{(\varepsilon)}(n)$
depends on $\varepsilon$ in the sense that its transition probabilities $Q_{ij}^{(\varepsilon)}(n)$ are continuous at
$\varepsilon = 0$ when considered as functions of $\varepsilon$. Thus, we can, for $\varepsilon > 0$, interpret the
process $\xi^{(\varepsilon)}(n)$ as a perturbation of $\xi^{(0)}(n)$.

Throughout the paper, we consider the case where the set of states $\{1, \ldots, N\}$ is a
communicating class of states for $\varepsilon$ small enough. Transitions to state 0 may, or may
not, be possible both for the perturbed process and the limiting process. It will also
be natural to consider state 0 as an absorbing state, but the results hold even if this is
not the case.
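Our main objects of study are the following mixed power-exponential moment
functionals,

$$\Phi_{ij}^{(\varepsilon)}(\rho, r) = \sum_{n=0}^{\infty} n^r e^{\rho n} g_{ij}^{(\varepsilon)}(n), \qquad \omega_{ijs}^{(\varepsilon)}(\rho, r) = \sum_{n=0}^{\infty} n^r e^{\rho n} h_{ijs}^{(\varepsilon)}(n), \qquad (1)$$

where $\rho \in \mathbb{R}$, $r = 0, 1, \ldots$, $i, j, s \in X$,

$$g_{ij}^{(\varepsilon)}(n) = P_i\{\mu_j^{(\varepsilon)} = n,\ \mu_0^{(\varepsilon)} > \mu_j^{(\varepsilon)}\},$$

$$h_{ijs}^{(\varepsilon)}(n) = P_i\{\xi^{(\varepsilon)}(n) = s,\ \mu_0^{(\varepsilon)} \wedge \mu_j^{(\varepsilon)} > n\},$$

and $\mu_j^{(\varepsilon)}$ is the first hitting time of state $j$.

As is well known, power moments, exponential moments, and, as in (1), a mixture
of power and exponential moments, often play important roles in various applications.
One reason that the moments defined by Eq. (1) are of interest is that the probabilities
$P_{ij}^{(\varepsilon)}(n) = P_i\{\xi^{(\varepsilon)}(n) = j,\ \mu_0^{(\varepsilon)} > n\}$ satisfy the following discrete time renewal
equation,

$$P_{ij}^{(\varepsilon)}(n) = h_{iij}^{(\varepsilon)}(n) + \sum_{k=0}^{n} P_{ij}^{(\varepsilon)}(n - k)\, g_{ii}^{(\varepsilon)}(k), \quad n = 0, 1, \ldots.$$

This can, for example, be used in studies of quasi-stationary distributions as is illustrated
in [6].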

Under the assumption that mixed power-exponential moments for transition prob- 
abilities can be expanded in asymptotic power series with respect to the perturbation 
parameter, we obtain corresponding asymptotic expansions for the moment func- 
tionals in Eq. (1). These expansions together with explicit formulas for calculating 
the coefficients in the expansions are the main results of this paper. 

In order to achieve this, we use methods from [2] where corresponding moment 
functionals for continuous time semi-Markov processes are studied. These meth- 
ods are based on first deriving recursive systems of linear equations connecting the 
moments of interest with moments of transition probabilities and then successively 
build expansions for solutions of such systems. 

Analysis of perturbed Markov chains and semi-Markov processes constitutes a
large branch of research in applied probability, see, for example, the books [2–4, 7],
and [1]. More detailed comments on this and additional references are given in [6].

Let us now briefly outline the structure of the present paper. In Sect. 2 we define
perturbed discrete time semi-Markov processes and formulate our basic conditions.
Then, systems of linear equations for exponential moment functionals are derived
in Sect. 3 and in Sect. 4 we show convergence for the solutions of these systems.
Finally, in Sect. 5, we present the main results which give asymptotic expansions for
mixed power-exponential moment functionals.


2 Perturbed Semi-Markov Processes 


In this section we define perturbed discrete time semi-Markov processes and formu- 
late some basic conditions. 

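For every $\varepsilon \ge 0$, let $(\eta_n^{(\varepsilon)}, \kappa_n^{(\varepsilon)})$, $n = 0, 1, \ldots$, be a discrete time Markov renewal
process, i.e., a homogeneous Markov chain with state space $X \times \mathbb{N}$, where $X =
\{0, 1, \ldots, N\}$ and $\mathbb{N} = \{1, 2, \ldots\}$, an initial distribution $Q_i^{(\varepsilon)} = P\{\eta_0^{(\varepsilon)} = i\}$, $i \in X$,
and transition probabilities which do not depend on the current value of the second
component, given by

$$Q_{ij}^{(\varepsilon)}(k) = P\{\eta_{n+1}^{(\varepsilon)} = j,\ \kappa_{n+1}^{(\varepsilon)} = k \mid \eta_n^{(\varepsilon)} = i,\ \kappa_n^{(\varepsilon)} = l\}, \quad k, l \in \mathbb{N},\ i, j \in X.$$

In this case, it is known that $\eta_n^{(\varepsilon)}$, $n = 0, 1, \ldots$, is also a Markov chain with state
space $X$ and transition probabilities,

$$p_{ij}^{(\varepsilon)} = P\{\eta_{n+1}^{(\varepsilon)} = j \mid \eta_n^{(\varepsilon)} = i\} = \sum_{k=1}^{\infty} Q_{ij}^{(\varepsilon)}(k), \quad i, j \in X.$$

Let us define $\tau^{(\varepsilon)}(0) = 0$ and $\tau^{(\varepsilon)}(n) = \kappa_1^{(\varepsilon)} + \cdots + \kappa_n^{(\varepsilon)}$, for $n \in \mathbb{N}$. Furthermore,
for $n = 0, 1, \ldots$, we define $\nu^{(\varepsilon)}(n) = \max\{k : \tau^{(\varepsilon)}(k) \le n\}$. The discrete time semi-Markov
process associated with the Markov renewal process $(\eta_n^{(\varepsilon)}, \kappa_n^{(\varepsilon)})$ is defined
by the following relation,

$$\xi^{(\varepsilon)}(n) = \eta^{(\varepsilon)}_{\nu^{(\varepsilon)}(n)}, \quad n = 0, 1, \ldots,$$

and we will refer to $Q_{ij}^{(\varepsilon)}(k)$ as the transition probabilities of this process.


In the semi-Markov process defined above, we have that (i) $\kappa_n^{(\varepsilon)}$ are the times
between successive moments of jumps, (ii) $\tau^{(\varepsilon)}(n)$ are the moments of the jumps,
(iii) $\nu^{(\varepsilon)}(n)$ are the number of jumps in the interval $[0, n]$, and (iv) $\eta_n^{(\varepsilon)}$ is the embedded
Markov chain.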
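
The relation between the embedded chain, the jump moments and the semi-Markov process can be made concrete as follows. This is a minimal sketch under stated assumptions: the sequences `etas` (values of the embedded chain, starting at index 0) and `kappas` (successive inter-jump times, starting with the first one) are supplied by the caller and are long enough to cover the requested horizon; the function name `semi_markov_path` is hypothetical.

```python
def semi_markov_path(etas, kappas, horizon):
    """Return [xi(0), ..., xi(horizon)], where xi(n) = eta_{nu(n)},
    tau(k) = kappa_1 + ... + kappa_k and nu(n) = max{k : tau(k) <= n}."""
    xi = []
    k = 0                  # current number of jumps, nu(n)
    tau_next = kappas[0]   # moment of the next jump, tau(k + 1)
    for n in range(horizon + 1):
        while tau_next <= n:
            k += 1
            tau_next += kappas[k]  # kappas[k] holds kappa_{k+1}
        xi.append(etas[k])
    return xi


# Assumed example: jumps at times 2, 3 and 6 (the next one would be at 11).
print(semi_markov_path([1, 2, 1, 0], [2, 1, 3, 5], 6))
```

The process stays in its current state between jump moments and switches to the next value of the embedded chain exactly at each jump moment.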

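It is sometimes convenient to write the transition probabilities of the semi-Markov
process as $Q_{ij}^{(\varepsilon)}(k) = p_{ij}^{(\varepsilon)} f_{ij}^{(\varepsilon)}(k)$, where

$$f_{ij}^{(\varepsilon)}(k) = P\{\kappa_{n+1}^{(\varepsilon)} = k \mid \eta_n^{(\varepsilon)} = i,\ \eta_{n+1}^{(\varepsilon)} = j\}, \quad k \in \mathbb{N},\ i, j \in X,$$

are the conditional distributions of transition times.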
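
As an illustration of this factorisation, the sketch below splits a kernel with finite support into the embedded transition matrix and the conditional transition-time distributions. It is only an example under assumptions: the kernel is stored as nested lists of dictionaries `Q[i][j] = {k: probability}`, exact arithmetic is done with `Fraction`, and the name `decompose` is hypothetical.

```python
from fractions import Fraction


def decompose(Q):
    """Split Q[i][j][k] into p[i][j] = sum over k of Q[i][j][k]
    and f[i][j][k] = Q[i][j][k] / p[i][j] (empty when p[i][j] = 0)."""
    p, f = [], []
    for row in Q:
        p_row, f_row = [], []
        for Qij in row:
            total = sum(Qij.values(), Fraction(0))
            p_row.append(total)
            f_row.append({k: q / total for k, q in Qij.items()} if total > 0 else {})
        p.append(p_row)
        f.append(f_row)
    return p, f


# Assumed two-state example: state 0 is absorbing.
Q = [
    [{1: Fraction(1)}, {}],
    [{1: Fraction(1, 10)}, {1: Fraction(3, 10), 2: Fraction(6, 10)}],
]
p, f = decompose(Q)
print(p[1], f[1][1])
```

Each row of `p` sums to one, and every non-empty `f[i][j]` is a probability distribution on the positive integers.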
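We now define random variables for first hitting times. For each $j \in X$, let $\nu_j^{(\varepsilon)} =
\min\{n \ge 1 : \eta_n^{(\varepsilon)} = j\}$ and $\mu_j^{(\varepsilon)} = \tau^{(\varepsilon)}(\nu_j^{(\varepsilon)})$. Then, $\nu_j^{(\varepsilon)}$ and $\mu_j^{(\varepsilon)}$ are the first hitting
times of state $j$ for the embedded Markov chain and the semi-Markov process,
respectively. Note that the random variables $\nu_j^{(\varepsilon)}$ and $\mu_j^{(\varepsilon)}$, which may be improper,
take values in the set $\{1, 2, \ldots, \infty\}$.

Let us define

$$g_{ij}^{(\varepsilon)}(n) = P_i\{\mu_j^{(\varepsilon)} = n,\ \nu_0^{(\varepsilon)} > \nu_j^{(\varepsilon)}\}, \quad n = 0, 1, \ldots,\ i, j \in X,$$

and

$$g_{ij}^{(\varepsilon)} = P_i\{\nu_0^{(\varepsilon)} > \nu_j^{(\varepsilon)}\}, \quad i, j \in X.$$

Here, and in what follows, we write $P_i(A^{(\varepsilon)}) = P\{A^{(\varepsilon)} \mid \eta_0^{(\varepsilon)} = i\}$ for any event $A^{(\varepsilon)}$.
Corresponding notation for conditional expectation will also be used.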
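Moment generating functions for distributions of first hitting times are defined by

$$\phi_{ij}^{(\varepsilon)}(\rho) = \sum_{n=0}^{\infty} e^{\rho n} g_{ij}^{(\varepsilon)}(n) = E_i\, e^{\rho \mu_j^{(\varepsilon)}} \chi(\nu_0^{(\varepsilon)} > \nu_j^{(\varepsilon)}), \quad \rho \in \mathbb{R},\ i, j \in X. \qquad (2)$$

Furthermore, let us define the following exponential moment functionals for transition
probabilities,

$$p_{ij}^{(\varepsilon)}(\rho) = \sum_{n=0}^{\infty} e^{\rho n} Q_{ij}^{(\varepsilon)}(n), \quad \rho \in \mathbb{R},\ i, j \in X,$$

where we define $Q_{ij}^{(\varepsilon)}(0) = 0$.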
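Let us now introduce the following conditions, which we will refer to frequently
throughout the paper:

A: (a) $p_{ij}^{(\varepsilon)} \to p_{ij}^{(0)}$, as $\varepsilon \to 0$, $i \neq 0$, $j \in X$.
   (b) $f_{ij}^{(\varepsilon)}(n) \to f_{ij}^{(0)}(n)$, as $\varepsilon \to 0$, $n \in \mathbb{N}$, $i \neq 0$, $j \in X$.

B: $g_{ij}^{(0)} > 0$, $i, j \neq 0$.

C: There exists $\beta > 0$ such that:
   (a) $\limsup_{0 \le \varepsilon \to 0} p_{ij}^{(\varepsilon)}(\beta) < \infty$, for all $i \neq 0$, $j \in X$.
   (b) $\phi_{ii}^{(0)}(\beta_i) \in (1, \infty)$, for some $i \neq 0$ and $\beta_i \le \beta$.

It follows from conditions A and B that $\{1, \ldots, N\}$ is a communicating class of
states for sufficiently small $\varepsilon$. Let us also remark that if $p_{i0}^{(0)} = 0$ for all $i \neq 0$, it can
be shown that part (b) of condition C always holds under conditions A, B, and C(a).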
3 Systems of Linear Equations


In this section we derive systems of linear equations for exponential moment func- 
tionals. 
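We first consider the moment generating functions $\phi_{ij}^{(\varepsilon)}(\rho)$, defined by Eq. (2). By conditioning on $(\eta_1^{(\varepsilon)}, \kappa_1^{(\varepsilon)})$, we get for each $i, j \ne 0$,

$$\begin{aligned}
\phi_{ij}^{(\varepsilon)}(\rho) &= \sum_{l \in X} \sum_{k=1}^{\infty} \mathsf{E}_i\big(e^{\rho \mu_j^{(\varepsilon)}} \chi(\nu_0^{(\varepsilon)} > \nu_j^{(\varepsilon)}) \mid \eta_1^{(\varepsilon)} = l,\ \kappa_1^{(\varepsilon)} = k\big) Q_{il}^{(\varepsilon)}(k) \\
&= \sum_{k=1}^{\infty} e^{\rho k} Q_{ij}^{(\varepsilon)}(k) + \sum_{l \ne 0, j} \sum_{k=1}^{\infty} \mathsf{E}_l e^{\rho(k + \mu_j^{(\varepsilon)})} \chi(\nu_0^{(\varepsilon)} > \nu_j^{(\varepsilon)}) Q_{il}^{(\varepsilon)}(k).
\end{aligned} \tag{3}$$

Relation (3) gives us the following system of linear equations,

$$\phi_{ij}^{(\varepsilon)}(\rho) = p_{ij}^{(\varepsilon)}(\rho) + \sum_{l \ne 0, j} p_{il}^{(\varepsilon)}(\rho) \phi_{lj}^{(\varepsilon)}(\rho), \quad i, j \ne 0. \tag{4}$$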


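In what follows it will often be convenient to use matrix notation. Let us introduce the following column vectors,

$$\boldsymbol{\Phi}_j^{(\varepsilon)}(\rho) = \big[\phi_{1j}^{(\varepsilon)}(\rho) \ \cdots \ \phi_{Nj}^{(\varepsilon)}(\rho)\big]^T, \quad j \ne 0, \tag{5}$$

$$\mathbf{p}_j^{(\varepsilon)}(\rho) = \big[p_{1j}^{(\varepsilon)}(\rho) \ \cdots \ p_{Nj}^{(\varepsilon)}(\rho)\big]^T, \quad j \in X. \tag{6}$$

For each $j \ne 0$, we also define $N \times N$ matrices ${}_j\mathbf{P}^{(\varepsilon)}(\rho) = \|{}_jp_{ik}^{(\varepsilon)}(\rho)\|$ where the elements are given by

$${}_jp_{ik}^{(\varepsilon)}(\rho) = \begin{cases} p_{ik}^{(\varepsilon)}(\rho) & i = 1, \ldots, N,\ k \ne j, \\ 0 & i = 1, \ldots, N,\ k = j. \end{cases} \tag{7}$$

Using (5)-(7), we can write the system (4) in the following matrix form,

$$\boldsymbol{\Phi}_j^{(\varepsilon)}(\rho) = \mathbf{p}_j^{(\varepsilon)}(\rho) + {}_j\mathbf{P}^{(\varepsilon)}(\rho) \boldsymbol{\Phi}_j^{(\varepsilon)}(\rho), \quad j \ne 0. \tag{8}$$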


Note that the relations given above hold for all $\rho \in \mathbb{R}$ even in the case where some of the quantities involved take the value infinity. In this case we use the convention $0 \cdot \infty = 0$ and the equalities may take the form $\infty = \infty$.
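As a rough numerical illustration of system (8), the following sketch solves it for a small example and compares the result with the truncated defining series from Eq. (2). Everything specific here is an assumption made for illustration and is not taken from the paper: the three-state chain `P`, the choice of unit transition times (so that $p_{ij}(\rho) = e^{\rho} p_{ij}$), and the helper names `solve`, `phi_column`, and `phi_series`.

```python
import math

# Hypothetical chain on X = {0, 1, 2} with unit transition times, so that
# p_ij(rho) = exp(rho) * p_ij.  Row i-1 holds (p_i0, p_i1, p_i2) for i = 1, 2.
P = [[0.1, 0.5, 0.4],
     [0.2, 0.3, 0.5]]
N = 2


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def phi_column(rho, j):
    """Solve (I - jP(rho)) Phi_j = p_j(rho), i.e. system (8), for fixed j != 0."""
    z = math.exp(rho)
    states = range(1, N + 1)
    # Column j of jP(rho) is zero, as in definition (7).
    a = [[(1.0 if i == k else 0.0) - (0.0 if k == j else z * P[i - 1][k])
          for k in states] for i in states]
    b = [z * P[i - 1][j] for i in states]
    return solve(a, b)


def phi_series(rho, j, terms=500):
    """Truncated defining series sum_n exp(rho n) g_ij(n) from Eq. (2)."""
    states = range(1, N + 1)
    g = [P[i - 1][j] for i in states]          # g_ij(1) = p_ij
    total = [0.0] * N
    for n in range(1, terms + 1):
        total = [t + math.exp(rho * n) * x for t, x in zip(total, g)]
        # g_ij(n + 1) = sum over l != 0, j of p_il * g_lj(n)
        g = [sum(P[i - 1][l] * g[l - 1] for l in states if l != j)
             for i in states]
    return total


print(phi_column(0.1, 1))
print(phi_series(0.1, 1))
```

For this example the series converges only while $e^{\rho} \max(p_{11}, p_{22}) < 1$; outside that range the system is still formally solvable, but its solution no longer equals the (infinite) moment generating function.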

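Let us now derive a similar type of system for the following exponential moment functionals,

$$\omega_{ijs}^{(\varepsilon)}(\rho) = \sum_{n=0}^{\infty} e^{\rho n} \mathsf{P}_i\{\xi^{(\varepsilon)}(n) = s,\ \mu_0^{(\varepsilon)} \wedge \mu_j^{(\varepsilon)} > n\}, \quad \rho \in \mathbb{R},\ i, j, s \in X.$$

First, note that

$$\omega_{ijs}^{(\varepsilon)}(\rho) = \mathsf{E}_i \sum_{n=0}^{\infty} e^{\rho n} \chi(\xi^{(\varepsilon)}(n) = s,\ \mu_0^{(\varepsilon)} \wedge \mu_j^{(\varepsilon)} > n) = \mathsf{E}_i \sum_{n=0}^{\mu_0^{(\varepsilon)} \wedge \mu_j^{(\varepsilon)} - 1} e^{\rho n} \chi(\xi^{(\varepsilon)}(n) = s).$$

We now decompose $\omega_{ijs}^{(\varepsilon)}(\rho)$ into two parts,

$$\omega_{ijs}^{(\varepsilon)}(\rho) = \mathsf{E}_i \sum_{n=0}^{\kappa_1^{(\varepsilon)} - 1} e^{\rho n} \chi(\xi^{(\varepsilon)}(n) = s) + \mathsf{E}_i \sum_{n=\kappa_1^{(\varepsilon)}}^{\mu_0^{(\varepsilon)} \wedge \mu_j^{(\varepsilon)} - 1} e^{\rho n} \chi(\xi^{(\varepsilon)}(n) = s). \tag{9}$$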


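Let us first rewrite the first term on the right hand side of Eq. (9). By conditioning on $\kappa_1^{(\varepsilon)}$ we get, for $i, s \ne 0$,

$$\begin{aligned}
\mathsf{E}_i \sum_{n=0}^{\kappa_1^{(\varepsilon)} - 1} e^{\rho n} \chi(\xi^{(\varepsilon)}(n) = s)
&= \sum_{k=1}^{\infty} \mathsf{E}_i\Big(\sum_{n=0}^{\kappa_1^{(\varepsilon)} - 1} e^{\rho n} \chi(\xi^{(\varepsilon)}(n) = s) \,\Big|\, \kappa_1^{(\varepsilon)} = k\Big) \mathsf{P}_i\{\kappa_1^{(\varepsilon)} = k\} \\
&= \sum_{k=1}^{\infty} \Big(\sum_{n=0}^{k-1} e^{\rho n} \delta(i, s)\Big) \mathsf{P}_i\{\kappa_1^{(\varepsilon)} = k\}.
\end{aligned}$$

It follows that

$$\mathsf{E}_i \sum_{n=0}^{\kappa_1^{(\varepsilon)} - 1} e^{\rho n} \chi(\xi^{(\varepsilon)}(n) = s) = \delta(i, s) \varphi_i^{(\varepsilon)}(\rho), \quad i, s \ne 0, \tag{10}$$

where

$$\varphi_i^{(\varepsilon)}(\rho) = \begin{cases} \mathsf{E}_i \kappa_1^{(\varepsilon)} & \rho = 0, \\ (\mathsf{E}_i e^{\rho \kappa_1^{(\varepsilon)}} - 1)/(e^{\rho} - 1) & \rho \ne 0. \end{cases} \tag{11}$$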


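Let us now consider the second term on the right hand side of Eq. (9). By conditioning on $(\eta_1^{(\varepsilon)}, \kappa_1^{(\varepsilon)})$ we get, for $i, j, s \ne 0$,

$$\begin{aligned}
\mathsf{E}_i \sum_{n=\kappa_1^{(\varepsilon)}}^{\mu_0^{(\varepsilon)} \wedge \mu_j^{(\varepsilon)} - 1} e^{\rho n} \chi(\xi^{(\varepsilon)}(n) = s)
&= \sum_{l \ne 0, j} \sum_{k=1}^{\infty} \mathsf{E}_i\Big(\sum_{n=\kappa_1^{(\varepsilon)}}^{\mu_0^{(\varepsilon)} \wedge \mu_j^{(\varepsilon)} - 1} e^{\rho n} \chi(\xi^{(\varepsilon)}(n) = s) \,\Big|\, \eta_1^{(\varepsilon)} = l,\ \kappa_1^{(\varepsilon)} = k\Big) Q_{il}^{(\varepsilon)}(k) \\
&= \sum_{l \ne 0, j} \sum_{k=1}^{\infty} \mathsf{E}_l\Big(\sum_{n=0}^{\mu_0^{(\varepsilon)} \wedge \mu_j^{(\varepsilon)} - 1} e^{\rho(k+n)} \chi(\xi^{(\varepsilon)}(n) = s)\Big) Q_{il}^{(\varepsilon)}(k).
\end{aligned}$$

It follows that

$$\mathsf{E}_i \sum_{n=\kappa_1^{(\varepsilon)}}^{\mu_0^{(\varepsilon)} \wedge \mu_j^{(\varepsilon)} - 1} e^{\rho n} \chi(\xi^{(\varepsilon)}(n) = s) = \sum_{l \ne 0, j} p_{il}^{(\varepsilon)}(\rho) \omega_{ljs}^{(\varepsilon)}(\rho), \quad i, j, s \ne 0. \tag{12}$$

From (9), (10), and (12) we now get the following system of linear equations,

$$\omega_{ijs}^{(\varepsilon)}(\rho) = \delta(i, s) \varphi_i^{(\varepsilon)}(\rho) + \sum_{l \ne 0, j} p_{il}^{(\varepsilon)}(\rho) \omega_{ljs}^{(\varepsilon)}(\rho), \quad i, j, s \ne 0. \tag{13}$$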


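In order to write this system in matrix form, let us define the following column vectors,

$$\widehat{\boldsymbol{\varphi}}_s^{(\varepsilon)}(\rho) = \big[\delta(1, s) \varphi_1^{(\varepsilon)}(\rho) \ \cdots \ \delta(N, s) \varphi_N^{(\varepsilon)}(\rho)\big]^T, \quad s \ne 0, \tag{14}$$

$$\boldsymbol{\omega}_{js}^{(\varepsilon)}(\rho) = \big[\omega_{1js}^{(\varepsilon)}(\rho) \ \cdots \ \omega_{Njs}^{(\varepsilon)}(\rho)\big]^T, \quad j, s \ne 0. \tag{15}$$

Using (7), (14), and (15), the system (13) can be written in the following matrix form,

$$\boldsymbol{\omega}_{js}^{(\varepsilon)}(\rho) = \widehat{\boldsymbol{\varphi}}_s^{(\varepsilon)}(\rho) + {}_j\mathbf{P}^{(\varepsilon)}(\rho) \boldsymbol{\omega}_{js}^{(\varepsilon)}(\rho), \quad j, s \ne 0. \tag{16}$$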
We close this section with a lemma which will be important in what follows. 


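Lemma 1 Assume that we for some $\varepsilon \ge 0$ and $\rho \in \mathbb{R}$ have that $g_{ik}^{(\varepsilon)} > 0$, $i, k \ne 0$, and $p_{ik}^{(\varepsilon)}(\rho) < \infty$, $i \ne 0$, $k \in X$. Then, for any $j \ne 0$, the following statements are equivalent:

(a) $\boldsymbol{\Phi}_j^{(\varepsilon)}(\rho) < \infty$.
(b) $\boldsymbol{\omega}_{js}^{(\varepsilon)}(\rho) < \infty$, $s \ne 0$.
(c) The inverse matrix $(\mathbf{I} - {}_j\mathbf{P}^{(\varepsilon)}(\rho))^{-1}$ exists.

Proof For each $j \ne 0$, let us define a matrix valued function ${}_j\mathbf{A}^{(\varepsilon)}(\rho) = \|{}_ja_{ik}^{(\varepsilon)}(\rho)\|$ by the relation

$${}_j\mathbf{A}^{(\varepsilon)}(\rho) = \mathbf{I} + {}_j\mathbf{P}^{(\varepsilon)}(\rho) + ({}_j\mathbf{P}^{(\varepsilon)}(\rho))^2 + \cdots, \quad \rho \in \mathbb{R}. \tag{17}$$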


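Since each term on the right hand side of (17) is non-negative, it follows that the elements ${}_ja_{ik}^{(\varepsilon)}(\rho)$ are well defined and take values in the set $[0, \infty]$. Furthermore, the elements can be written in the following form which gives a probabilistic interpretation,

$${}_ja_{ik}^{(\varepsilon)}(\rho) = \mathsf{E}_i \sum_{n=0}^{\infty} e^{\rho(\kappa_1^{(\varepsilon)} + \cdots + \kappa_n^{(\varepsilon)})} \chi(\nu_0^{(\varepsilon)} \wedge \nu_j^{(\varepsilon)} > n,\ \eta_n^{(\varepsilon)} = k), \quad \rho \in \mathbb{R},\ i, k \ne 0. \tag{18}$$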


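Let us now show that

$$\boldsymbol{\Phi}_j^{(\varepsilon)}(\rho) = {}_j\mathbf{A}^{(\varepsilon)}(\rho) \mathbf{p}_j^{(\varepsilon)}(\rho), \quad \rho \in \mathbb{R},\ j \ne 0. \tag{19}$$

In order to do this, first note that, for $j \ne 0$,

$$\chi(\nu_0^{(\varepsilon)} > \nu_j^{(\varepsilon)}) = \sum_{n=0}^{\infty} \sum_{k \ne 0} \chi(\nu_0^{(\varepsilon)} \wedge \nu_j^{(\varepsilon)} > n,\ \eta_n^{(\varepsilon)} = k,\ \eta_{n+1}^{(\varepsilon)} = j). \tag{20}$$

Using (20) and the regenerative property of the semi-Markov process, the following is obtained, for $i, j \ne 0$,

$$\begin{aligned}
\phi_{ij}^{(\varepsilon)}(\rho) &= \sum_{n=0}^{\infty} \sum_{k \ne 0} \mathsf{E}_i e^{\rho \mu_j^{(\varepsilon)}} \chi(\nu_0^{(\varepsilon)} \wedge \nu_j^{(\varepsilon)} > n,\ \eta_n^{(\varepsilon)} = k,\ \eta_{n+1}^{(\varepsilon)} = j) \\
&= \sum_{n=0}^{\infty} \sum_{k \ne 0} \mathsf{E}_i e^{\rho(\kappa_1^{(\varepsilon)} + \cdots + \kappa_n^{(\varepsilon)})} \chi(\nu_0^{(\varepsilon)} \wedge \nu_j^{(\varepsilon)} > n,\ \eta_n^{(\varepsilon)} = k)\, p_{kj}^{(\varepsilon)}(\rho).
\end{aligned} \tag{21}$$

From (18) and (21) we get

$$\phi_{ij}^{(\varepsilon)}(\rho) = \sum_{k \ne 0} {}_ja_{ik}^{(\varepsilon)}(\rho) p_{kj}^{(\varepsilon)}(\rho), \quad i, j \ne 0, \tag{22}$$

and this proves (19).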
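Let us now define

$$\omega_{ij}^{(\varepsilon)}(\rho) = \sum_{s \ne 0} \omega_{ijs}^{(\varepsilon)}(\rho) = \sum_{n=0}^{\infty} e^{\rho n} \mathsf{P}_i\{\mu_0^{(\varepsilon)} \wedge \mu_j^{(\varepsilon)} > n\}, \quad \rho \in \mathbb{R},\ i, j \ne 0. \tag{23}$$

Then,

$$\omega_{ij}^{(\varepsilon)}(\rho) = \begin{cases} \mathsf{E}_i(\mu_0^{(\varepsilon)} \wedge \mu_j^{(\varepsilon)}) & \rho = 0, \\ (\mathsf{E}_i e^{\rho(\mu_0^{(\varepsilon)} \wedge \mu_j^{(\varepsilon)})} - 1)/(e^{\rho} - 1) & \rho \ne 0. \end{cases} \tag{24}$$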


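Also notice that

$$\mathsf{E}_i e^{\rho(\mu_0^{(\varepsilon)} \wedge \mu_j^{(\varepsilon)})} = \mathsf{E}_i e^{\rho \mu_j^{(\varepsilon)}} \chi(\nu_0^{(\varepsilon)} > \nu_j^{(\varepsilon)}) + \mathsf{E}_i e^{\rho \mu_0^{(\varepsilon)}} \chi(\nu_0^{(\varepsilon)} < \nu_j^{(\varepsilon)}), \quad i, j \ne 0. \tag{25}$$

Using similar calculations as above, it can be shown that

$$\mathsf{E}_i e^{\rho \mu_0^{(\varepsilon)}} \chi(\nu_0^{(\varepsilon)} < \nu_j^{(\varepsilon)}) = \sum_{k \ne 0} {}_ja_{ik}^{(\varepsilon)}(\rho) p_{k0}^{(\varepsilon)}(\rho), \quad i, j \ne 0. \tag{26}$$

It follows from (22), (25), and (26) that

$$\mathsf{E}_i e^{\rho(\mu_0^{(\varepsilon)} \wedge \mu_j^{(\varepsilon)})} = \sum_{k \ne 0} {}_ja_{ik}^{(\varepsilon)}(\rho) \big(p_{kj}^{(\varepsilon)}(\rho) + p_{k0}^{(\varepsilon)}(\rho)\big), \quad i, j \ne 0. \tag{27}$$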


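Let us now show that (a) implies (b).

By iterating relation (8) we obtain,

$$\boldsymbol{\Phi}_j^{(\varepsilon)}(\rho) = \big(\mathbf{I} + {}_j\mathbf{P}^{(\varepsilon)}(\rho) + \cdots + ({}_j\mathbf{P}^{(\varepsilon)}(\rho))^n\big) \mathbf{p}_j^{(\varepsilon)}(\rho) + ({}_j\mathbf{P}^{(\varepsilon)}(\rho))^{n+1} \boldsymbol{\Phi}_j^{(\varepsilon)}(\rho), \quad n = 1, 2, \ldots. \tag{28}$$

Since $\boldsymbol{\Phi}_j^{(\varepsilon)}(\rho) < \infty$, it follows from (28) that

$$({}_j\mathbf{P}^{(\varepsilon)}(\rho))^{n+1} \boldsymbol{\Phi}_j^{(\varepsilon)}(\rho) \to \mathbf{0}, \quad \text{as } n \to \infty. \tag{29}$$

The assumptions of the lemma guarantee that $\boldsymbol{\Phi}_j^{(\varepsilon)}(\rho) > 0$. From this and relation (29) we can conclude that $({}_j\mathbf{P}^{(\varepsilon)}(\rho))^{n+1} \to \mathbf{0}$, as $n \to \infty$. It is known that this holds if and only if the matrix series (17) converges in norms, that is, ${}_j\mathbf{A}^{(\varepsilon)}(\rho)$ is finite. From this and relations (23), (24), and (27) it follows that (b) holds.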

Next we show that (b) implies (c). 

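By summing over all $s \ne 0$ in relation (16) it follows that

$$\boldsymbol{\omega}_j^{(\varepsilon)}(\rho) = \boldsymbol{\varphi}^{(\varepsilon)}(\rho) + {}_j\mathbf{P}^{(\varepsilon)}(\rho) \boldsymbol{\omega}_j^{(\varepsilon)}(\rho), \quad \rho \in \mathbb{R}, \tag{30}$$

where

$$\boldsymbol{\omega}_j^{(\varepsilon)}(\rho) = \big[\omega_{1j}^{(\varepsilon)}(\rho) \ \cdots \ \omega_{Nj}^{(\varepsilon)}(\rho)\big]^T, \quad j \ne 0,$$

and

$$\boldsymbol{\varphi}^{(\varepsilon)}(\rho) = \big[\varphi_1^{(\varepsilon)}(\rho) \ \cdots \ \varphi_N^{(\varepsilon)}(\rho)\big]^T.$$

By iterating relation (30) we get

$$\boldsymbol{\omega}_j^{(\varepsilon)}(\rho) = \big(\mathbf{I} + {}_j\mathbf{P}^{(\varepsilon)}(\rho) + \cdots + ({}_j\mathbf{P}^{(\varepsilon)}(\rho))^n\big) \boldsymbol{\varphi}^{(\varepsilon)}(\rho) + ({}_j\mathbf{P}^{(\varepsilon)}(\rho))^{n+1} \boldsymbol{\omega}_j^{(\varepsilon)}(\rho), \quad n = 1, 2, \ldots. \tag{31}$$

It follows from (b) and the definition of $\omega_{ij}^{(\varepsilon)}(\rho)$ that $0 < \boldsymbol{\omega}_j^{(\varepsilon)}(\rho) < \infty$. So, letting $n \to \infty$ in (31) and using similar arguments as above, it follows that the matrix series (17) converges in norms. It is then known that the inverse matrix $(\mathbf{I} - {}_j\mathbf{P}^{(\varepsilon)}(\rho))^{-1}$ exists, that is, (c) holds.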

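Let us finally argue that (c) implies (a).

If $(\mathbf{I} - {}_j\mathbf{P}^{(\varepsilon)}(\rho))^{-1}$ exists, then the following relation holds,

$$(\mathbf{I} - {}_j\mathbf{P}^{(\varepsilon)}(\rho))^{-1} = \mathbf{I} + {}_j\mathbf{P}^{(\varepsilon)}(\rho) (\mathbf{I} - {}_j\mathbf{P}^{(\varepsilon)}(\rho))^{-1}. \tag{32}$$

Iteration of (32) gives

$$(\mathbf{I} - {}_j\mathbf{P}^{(\varepsilon)}(\rho))^{-1} = \mathbf{I} + {}_j\mathbf{P}^{(\varepsilon)}(\rho) + ({}_j\mathbf{P}^{(\varepsilon)}(\rho))^2 + \cdots + ({}_j\mathbf{P}^{(\varepsilon)}(\rho))^n + ({}_j\mathbf{P}^{(\varepsilon)}(\rho))^{n+1} (\mathbf{I} - {}_j\mathbf{P}^{(\varepsilon)}(\rho))^{-1}, \quad n = 1, 2, \ldots. \tag{33}$$

Letting $n \to \infty$ in (33) it follows that ${}_j\mathbf{A}^{(\varepsilon)}(\rho) = (\mathbf{I} - {}_j\mathbf{P}^{(\varepsilon)}(\rho))^{-1} < \infty$. From (19) we now see that (a) holds.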


4 Convergence of Moment Functionals 


In this section it is shown that the solutions of the systems derived in Sect. 3 converge 
as the perturbation parameter tends to zero. In addition, we prove some properties 
for the solution of a characteristic equation. 

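Let us define

$${}_k\phi_{ij}^{(\varepsilon)}(\rho) = \mathsf{E}_i e^{\rho \mu_j^{(\varepsilon)}} \chi(\nu_0^{(\varepsilon)} \wedge \nu_k^{(\varepsilon)} > \nu_j^{(\varepsilon)}), \quad \rho \in \mathbb{R},\ i, j, k \in X.$$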

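If the states $\{1, \ldots, N\}$ form a communicating class and $\phi_{ii}^{(\varepsilon)}(\rho) \le 1$ for some $i \ne 0$, then it can be shown (see, for example, [5]) that the following relation holds for all $j \ne 0$,

$$\big(1 - \phi_{ii}^{(\varepsilon)}(\rho)\big)\big(1 - {}_i\phi_{jj}^{(\varepsilon)}(\rho)\big) = \big(1 - \phi_{jj}^{(\varepsilon)}(\rho)\big)\big(1 - {}_j\phi_{ii}^{(\varepsilon)}(\rho)\big). \tag{34}$$

Relation (34) is useful in order to prove various solidarity properties for semi-Markov processes. In particular, if $\phi_{ii}^{(\varepsilon)}(\rho) = 1$, relation (34) reduces to

$$\big(1 - \phi_{jj}^{(\varepsilon)}(\rho)\big)\big(1 - {}_j\phi_{ii}^{(\varepsilon)}(\rho)\big) = 0. \tag{35}$$

From the regenerative property of the semi-Markov process it follows that

$$\phi_{ii}^{(\varepsilon)}(\rho) = {}_j\phi_{ii}^{(\varepsilon)}(\rho) + {}_i\phi_{ij}^{(\varepsilon)}(\rho) \phi_{ji}^{(\varepsilon)}(\rho), \quad j \ne 0, i. \tag{36}$$

Since $\{1, \ldots, N\}$ is a communicating class, we have ${}_i\phi_{ij}^{(\varepsilon)}(\rho) > 0$ and $\phi_{ji}^{(\varepsilon)}(\rho) > 0$. So, if $\phi_{ii}^{(\varepsilon)}(\rho) = 1$ it follows from (36) that ${}_j\phi_{ii}^{(\varepsilon)}(\rho) < 1$. From this and (35) we can conclude that $\phi_{jj}^{(\varepsilon)}(\rho) = 1$ for all $j \ne 0$. Thus, we have the following lemma: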


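Lemma 2 Assume that we for some $\varepsilon \ge 0$ have that $g_{kj}^{(\varepsilon)} > 0$ for all $k, j \ne 0$. Then, if we for some $i \ne 0$ and $\rho \in \mathbb{R}$ have that $\phi_{ii}^{(\varepsilon)}(\rho) = 1$, it follows that $\phi_{jj}^{(\varepsilon)}(\rho) = 1$ for all $j \ne 0$.

Let us now define the following characteristic equation,

$$\phi_{ii}^{(\varepsilon)}(\rho) = 1, \tag{37}$$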


where i 4 0 is arbitrary. The root of Eq. (37) plays an important role for the asymp- 
totic behaviour of the corresponding semi-Markov process, see, for example, [6]. 

The following lemma gives limits of moment functionals and properties for the 
root of the characteristic equation. 


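Lemma 3 If conditions A-C hold, then there exists $\delta \in (0, \beta]$ such that the following holds:

(i) $\phi_{kj}^{(\varepsilon)}(\rho) \to \phi_{kj}^{(0)}(\rho) < \infty$, as $\varepsilon \to 0$, $\rho \le \delta$, $k, j \ne 0$.
(ii) $\omega_{kjs}^{(\varepsilon)}(\rho) \to \omega_{kjs}^{(0)}(\rho) < \infty$, as $\varepsilon \to 0$, $\rho \le \delta$, $k, j, s \ne 0$.
(iii) $\phi_{jj}^{(0)}(\delta) \in (1, \infty)$, $j \ne 0$.
(iv) For sufficiently small $\varepsilon$, there exists a unique non-negative root $\rho^{(\varepsilon)}$ of the characteristic equation (37) which does not depend on $i$.
(v) $\rho^{(\varepsilon)} \to \rho^{(0)} < \delta$ as $\varepsilon \to 0$.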


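Proof Let $i \ne 0$ and $\beta_i \le \beta$ be the values given in condition C. It follows from conditions B and C that $\phi_{ii}^{(0)}(\rho)$ is a continuous and strictly increasing function for $\rho \le \beta_i$. Since $\phi_{ii}^{(0)}(0) = g_{ii}^{(0)} \le 1$ and $\phi_{ii}^{(0)}(\beta_i) > 1$, there exists a unique $\rho' \in [0, \beta_i)$ such that $\phi_{ii}^{(0)}(\rho') = 1$. Moreover, by Lemma 2,

$$\phi_{jj}^{(0)}(\rho') = 1, \quad j \ne 0. \tag{38}$$

For all $j \ne 0$, we have

$$\phi_{jj}^{(0)}(\rho') = {}_k\phi_{jj}^{(0)}(\rho') + {}_j\phi_{jk}^{(0)}(\rho') \phi_{kj}^{(0)}(\rho'), \quad k \ne 0, j. \tag{39}$$

It follows from (38) and (39), and condition B, that

$$\phi_{kj}^{(0)}(\rho') < \infty, \quad k, j \ne 0. \tag{40}$$

From (40) and Lemma 1 we get that $\det(\mathbf{I} - {}_j\mathbf{P}^{(0)}(\rho')) \ne 0$, for $j \ne 0$. Under condition C, the elements of $\mathbf{I} - {}_j\mathbf{P}^{(0)}(\rho)$ are continuous functions for $\rho \le \beta$. This implies that we for each $j \ne 0$ can find $\beta_j \in (\rho', \beta_i]$ such that $\det(\mathbf{I} - {}_j\mathbf{P}^{(0)}(\beta_j)) \ne 0$. By condition C we also have that $p_{kj}^{(0)}(\beta_j) < \infty$ for $k \ne 0$, $j \in X$. It now follows from Lemma 1 that $\phi_{kj}^{(0)}(\beta_j) < \infty$, $k, j \ne 0$. If we define $\delta = \min\{\beta_1, \ldots, \beta_N\}$, it follows that

$$\phi_{kj}^{(0)}(\rho) < \infty, \quad \rho \le \delta,\ k, j \ne 0. \tag{41}$$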


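Now, let $\rho \le \delta$ be fixed. Relation (41) and Lemma 1 imply that

$$\det(\mathbf{I} - {}_j\mathbf{P}^{(0)}(\rho)) \ne 0, \quad j \ne 0. \tag{42}$$

Note that we have

$$p_{kj}^{(\varepsilon)}(\rho) = p_{kj}^{(\varepsilon)} \sum_{n=0}^{\infty} e^{\rho n} f_{kj}^{(\varepsilon)}(n), \quad k, j \in X. \tag{43}$$

Since $f_{kj}^{(\varepsilon)}(n)$ are proper probability distributions, it follows from (43) and conditions A and C that

$$p_{kj}^{(\varepsilon)}(\rho) \to p_{kj}^{(0)}(\rho) < \infty, \quad \text{as } \varepsilon \to 0,\ k \ne 0,\ j \in X. \tag{44}$$

It follows from (42) and (44) that there exists $\varepsilon_1 > 0$ such that we for all $\varepsilon \le \varepsilon_1$ have that $\det(\mathbf{I} - {}_j\mathbf{P}^{(\varepsilon)}(\rho)) \ne 0$ and $p_{kj}^{(\varepsilon)}(\rho) < \infty$, for all $k, j \ne 0$. Using Lemma 1 once again, it now follows that $\phi_{kj}^{(\varepsilon)}(\rho) < \infty$, $k, j \ne 0$, for all $\varepsilon \le \varepsilon_1$. Moreover, in this case, the system of linear equations (8) has a unique solution for $\varepsilon \le \varepsilon_1$ given by

$$\boldsymbol{\Phi}_j^{(\varepsilon)}(\rho) = (\mathbf{I} - {}_j\mathbf{P}^{(\varepsilon)}(\rho))^{-1} \mathbf{p}_j^{(\varepsilon)}(\rho), \quad j \ne 0. \tag{45}$$

From (44) and (45) it follows that

$$\phi_{kj}^{(\varepsilon)}(\rho) \to \phi_{kj}^{(0)}(\rho) < \infty, \quad \text{as } \varepsilon \to 0,\ k, j \ne 0.$$

This completes the proof of part (i).

For the proof of part (ii) we first note that, since $\phi_{kj}^{(\varepsilon)}(\rho) < \infty$ for $\varepsilon \le \varepsilon_1$, $k, j \ne 0$, it follows from Lemma 1 that $\omega_{kjs}^{(\varepsilon)}(\rho) < \infty$ for $\varepsilon \le \varepsilon_1$, $k, j, s \ne 0$. From this, and arguments given above, we see that the system of linear equations given by relation (16) has a unique solution for $\varepsilon \le \varepsilon_1$ given by

$$\boldsymbol{\omega}_{js}^{(\varepsilon)}(\rho) = (\mathbf{I} - {}_j\mathbf{P}^{(\varepsilon)}(\rho))^{-1} \widehat{\boldsymbol{\varphi}}_s^{(\varepsilon)}(\rho), \quad j, s \ne 0. \tag{46}$$

Now, since $\mathsf{E}_i e^{\rho \kappa_1^{(\varepsilon)}} = \sum_{j \in X} p_{ij}^{(\varepsilon)}(\rho)$, it follows from (11) and (44) that $\varphi_i^{(\varepsilon)}(\rho) \to \varphi_i^{(0)}(\rho) < \infty$ as $\varepsilon \to 0$, $i \ne 0$. Using this and relations (44) and (46) we can conclude that part (ii) holds.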

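By part (i) we have, in particular, $\phi_{jj}^{(\varepsilon)}(\delta) \to \phi_{jj}^{(0)}(\delta) < \infty$ as $\varepsilon \to 0$, for all $j \ne 0$. Furthermore, since $\rho' < \delta$ and $\phi_{jj}^{(0)}(\rho)$ is strictly increasing for $\rho \le \delta$, it follows from (38) that $\phi_{jj}^{(0)}(\delta) > 1$, $j \ne 0$. This proves part (iii).

Let us now prove part (iv).

It follows from (i) and (iii) that we can find $\varepsilon_2 > 0$ such that $\phi_{jj}^{(\varepsilon)}(\delta) \in (1, \infty)$, $j \ne 0$, for all $\varepsilon \le \varepsilon_2$. By conditions A and B there exists $\varepsilon_3 > 0$ such that, for each $i \ne 0$ and $\varepsilon \le \varepsilon_3$, the function $g_{ii}^{(\varepsilon)}(n)$ is not concentrated at zero. Thus, for every $i \ne 0$ and $\varepsilon \le \min\{\varepsilon_2, \varepsilon_3\}$, we have that $\phi_{ii}^{(\varepsilon)}(\rho)$ is a continuous and strictly increasing function for $\rho \in [0, \delta]$. Since $\phi_{ii}^{(\varepsilon)}(0) = g_{ii}^{(\varepsilon)} \le 1$ and $\phi_{ii}^{(\varepsilon)}(\delta) > 1$, there exists a unique $\rho_i^{(\varepsilon)} \in [0, \delta)$ such that $\phi_{ii}^{(\varepsilon)}(\rho_i^{(\varepsilon)}) = 1$. By Lemma 2, the root of the characteristic equation does not depend on $i$ so we can write $\rho^{(\varepsilon)}$ instead of $\rho_i^{(\varepsilon)}$. This proves part (iv).

Finally, we show that $\rho^{(\varepsilon)} \to \rho^{(0)}$ as $\varepsilon \to 0$.

Let $\gamma > 0$ be arbitrary such that $\rho^{(0)} + \gamma \le \delta$. Then $\phi_{ii}^{(0)}(\rho^{(0)} - \gamma) < 1$ and $\phi_{ii}^{(0)}(\rho^{(0)} + \gamma) > 1$. From this and part (i) we get that there exists $\varepsilon_4 > 0$ such that $\phi_{ii}^{(\varepsilon)}(\rho^{(0)} - \gamma) < 1$ and $\phi_{ii}^{(\varepsilon)}(\rho^{(0)} + \gamma) > 1$, for all $\varepsilon \le \varepsilon_4$. So, it follows that $|\rho^{(\varepsilon)} - \rho^{(0)}| < \gamma$ for $\varepsilon \le \min\{\varepsilon_2, \varepsilon_3, \varepsilon_4\}$. This completes the proof of Lemma 3.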
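The root of the characteristic equation can be located numerically by bisection, since the function involved is continuous and strictly increasing on the relevant interval. The sketch below does this for a hypothetical example that is not taken from the paper: a Markov chain with unit transition times on $X = \{0, 1, 2\}$ whose restriction to $\{1, 2\}$ is the matrix `Q`. For $N = 2$, system (4) gives the closed form used in `phi_ii`. For such a chain the root should equal minus the logarithm of the largest eigenvalue of `Q`, whichever state $i$ is used, and the code checks this.

```python
import math

# Hypothetical sub-stochastic block (states 1 and 2) of a chain on {0, 1, 2};
# the missing row mass is the one-step probability of absorption in state 0.
Q = [[0.5, 0.4],
     [0.3, 0.5]]


def phi_ii(rho, i):
    """phi_ii(rho) for unit transition times and N = 2 (i is 0 or 1 here).

    From system (4): phi_ii = p_ii(rho) + p_ik(rho) * phi_ki and
    phi_ki = p_ki(rho) / (1 - p_kk(rho)), where k is the other state.
    Valid only while exp(rho) * Q[k][k] < 1.
    """
    z = math.exp(rho)
    k = 1 - i
    return Q[i][i] * z + Q[i][k] * z * Q[k][i] * z / (1.0 - Q[k][k] * z)


def characteristic_root(i, lo=0.0, hi=0.6, tol=1e-13):
    """Bisection for phi_ii(rho) = 1 on [lo, hi], where phi_ii is increasing."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi_ii(mid, i) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


rho_1 = characteristic_root(0)
rho_2 = characteristic_root(1)

# Largest eigenvalue of the 2x2 matrix Q in closed form.
trace = Q[0][0] + Q[1][1]
det = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]
perron = 0.5 * (trace + math.sqrt(trace * trace - 4.0 * det))

print(rho_1, rho_2, -math.log(perron))
```

The upper end `hi = 0.6` is chosen by hand so that both functions are finite and exceed one there; in the notation of Lemma 3 it plays the role of $\delta$.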


5 Expansions of Moment Functionals 


In this section, asymptotic expansions for mixed power-exponential moment func- 
tionals are constructed. The main results are given by Theorems | and 2. 

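Let us define the following mixed power-exponential moment functionals for distributions of first hitting times,

$$\phi_{ij}^{(\varepsilon)}(\rho, r) = \sum_{n=0}^{\infty} n^r e^{\rho n} g_{ij}^{(\varepsilon)}(n), \quad \rho \in \mathbb{R},\ r = 0, 1, \ldots,\ i, j \in X.$$

By definition, $\phi_{ij}^{(\varepsilon)}(\rho, 0) = \phi_{ij}^{(\varepsilon)}(\rho)$.

We also define the following mixed power-exponential moment functionals for transition probabilities,

$$p_{ij}^{(\varepsilon)}(\rho, r) = \sum_{n=0}^{\infty} n^r e^{\rho n} Q_{ij}^{(\varepsilon)}(n), \quad \rho \in \mathbb{R},\ r = 0, 1, \ldots,\ i, j \in X.$$

By definition, $p_{ij}^{(\varepsilon)}(\rho, 0) = p_{ij}^{(\varepsilon)}(\rho)$.

It follows from conditions A-C and Lemma 3 that, for $\rho < \delta$ and sufficiently small $\varepsilon$, the functions $\phi_{ij}^{(\varepsilon)}(\rho)$ and $p_{ij}^{(\varepsilon)}(\rho)$ are arbitrarily many times differentiable with respect to $\rho$, and the derivatives of order $r$ are given by $\phi_{ij}^{(\varepsilon)}(\rho, r)$ and $p_{ij}^{(\varepsilon)}(\rho, r)$, respectively.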
Recall from Sect. 3 that the following system of linear equations holds, 


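$$\phi_{ij}^{(\varepsilon)}(\rho) = p_{ij}^{(\varepsilon)}(\rho) + \sum_{l \ne 0, j} p_{il}^{(\varepsilon)}(\rho) \phi_{lj}^{(\varepsilon)}(\rho), \quad i, j \ne 0. \tag{47}$$

Differentiating relation (47) gives

$$\phi_{ij}^{(\varepsilon)}(\rho, r) = \lambda_{ij}^{(\varepsilon)}(\rho, r) + \sum_{l \ne 0, j} p_{il}^{(\varepsilon)}(\rho) \phi_{lj}^{(\varepsilon)}(\rho, r), \quad r = 1, 2, \ldots,\ i, j \ne 0, \tag{48}$$

where

$$\lambda_{ij}^{(\varepsilon)}(\rho, r) = p_{ij}^{(\varepsilon)}(\rho, r) + \sum_{m=1}^{r} \binom{r}{m} \sum_{l \ne 0, j} p_{il}^{(\varepsilon)}(\rho, m) \phi_{lj}^{(\varepsilon)}(\rho, r - m). \tag{49}$$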


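In order to write relations (47)-(49) in matrix form, let us define the following column vectors,

$$\boldsymbol{\Phi}_j^{(\varepsilon)}(\rho, r) = \big[\phi_{1j}^{(\varepsilon)}(\rho, r) \ \cdots \ \phi_{Nj}^{(\varepsilon)}(\rho, r)\big]^T, \quad j \ne 0, \tag{50}$$

$$\mathbf{p}_j^{(\varepsilon)}(\rho, r) = \big[p_{1j}^{(\varepsilon)}(\rho, r) \ \cdots \ p_{Nj}^{(\varepsilon)}(\rho, r)\big]^T, \quad j \ne 0, \tag{51}$$

$$\boldsymbol{\lambda}_j^{(\varepsilon)}(\rho, r) = \big[\lambda_{1j}^{(\varepsilon)}(\rho, r) \ \cdots \ \lambda_{Nj}^{(\varepsilon)}(\rho, r)\big]^T, \quad j \ne 0. \tag{52}$$

Let us also, for $j \ne 0$, define $N \times N$ matrices ${}_j\mathbf{P}^{(\varepsilon)}(\rho, r) = \|{}_jp_{ik}^{(\varepsilon)}(\rho, r)\|$ where the elements are given by

$${}_jp_{ik}^{(\varepsilon)}(\rho, r) = \begin{cases} p_{ik}^{(\varepsilon)}(\rho, r) & i = 1, \ldots, N,\ k \ne j, \\ 0 & i = 1, \ldots, N,\ k = j. \end{cases} \tag{53}$$

Using (47)-(53) we can for any $j \ne 0$ write the following recursive systems of linear equations,

$$\boldsymbol{\Phi}_j^{(\varepsilon)}(\rho) = \mathbf{p}_j^{(\varepsilon)}(\rho) + {}_j\mathbf{P}^{(\varepsilon)}(\rho) \boldsymbol{\Phi}_j^{(\varepsilon)}(\rho), \tag{54}$$

and, for $r = 1, 2, \ldots$,

$$\boldsymbol{\Phi}_j^{(\varepsilon)}(\rho, r) = \boldsymbol{\lambda}_j^{(\varepsilon)}(\rho, r) + {}_j\mathbf{P}^{(\varepsilon)}(\rho) \boldsymbol{\Phi}_j^{(\varepsilon)}(\rho, r), \tag{55}$$

where

$$\boldsymbol{\lambda}_j^{(\varepsilon)}(\rho, r) = \mathbf{p}_j^{(\varepsilon)}(\rho, r) + \sum_{m=1}^{r} \binom{r}{m} {}_j\mathbf{P}^{(\varepsilon)}(\rho, m) \boldsymbol{\Phi}_j^{(\varepsilon)}(\rho, r - m). \tag{56}$$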


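Let us now introduce the following perturbation condition, which is assumed to hold for some $\rho < \delta$, where $\delta$ is the parameter in Lemma 3:

$\mathbf{P_k}$: $p_{ij}^{(\varepsilon)}(\rho, r) = p_{ij}^{(0)}(\rho, r) + p_{ij}[\rho, r, 1]\varepsilon + \cdots + p_{ij}[\rho, r, k - r]\varepsilon^{k-r} + o(\varepsilon^{k-r})$, for $r = 0, \ldots, k$, $i \ne 0$, $j \in X$, where $|p_{ij}[\rho, r, n]| < \infty$, for $r = 0, \ldots, k$, $n = 1, \ldots, k - r$, $i \ne 0$, $j \in X$.

For convenience, we denote $p_{ij}^{(0)}(\rho, r) = p_{ij}[\rho, r, 0]$, for $r = 0, \ldots, k$.

Note that if condition $\mathbf{P_k}$ holds, then, for $r = 0, \ldots, k$, we have the following asymptotic matrix expansions,

$${}_j\mathbf{P}^{(\varepsilon)}(\rho, r) = {}_j\mathbf{P}[\rho, r, 0] + {}_j\mathbf{P}[\rho, r, 1]\varepsilon + \cdots + {}_j\mathbf{P}[\rho, r, k - r]\varepsilon^{k-r} + \mathbf{o}(\varepsilon^{k-r}), \tag{57}$$

$$\mathbf{p}_j^{(\varepsilon)}(\rho, r) = \mathbf{p}_j[\rho, r, 0] + \mathbf{p}_j[\rho, r, 1]\varepsilon + \cdots + \mathbf{p}_j[\rho, r, k - r]\varepsilon^{k-r} + \mathbf{o}(\varepsilon^{k-r}). \tag{58}$$

Here, and in what follows, $\mathbf{o}(\varepsilon^p)$ denotes a matrix-valued function of $\varepsilon$ where all elements are of order $o(\varepsilon^p)$. The coefficients in (57) are $N \times N$ matrices ${}_j\mathbf{P}[\rho, r, n] = \|{}_jp_{ik}[\rho, r, n]\|$ with elements given by

$${}_jp_{ik}[\rho, r, n] = \begin{cases} p_{ik}[\rho, r, n] & i = 1, \ldots, N,\ k \ne j, \\ 0 & i = 1, \ldots, N,\ k = j, \end{cases}$$

and the coefficients in (58) are column vectors defined by

$$\mathbf{p}_j[\rho, r, n] = \big[p_{1j}[\rho, r, n] \ \cdots \ p_{Nj}[\rho, r, n]\big]^T.$$

Let us now define the following matrix, which will play an important role in what follows,

$${}_j\mathbf{U}^{(\varepsilon)}(\rho) = (\mathbf{I} - {}_j\mathbf{P}^{(\varepsilon)}(\rho))^{-1}.$$

Under conditions A-C, it follows from Lemmas 1 and 3 that ${}_j\mathbf{U}^{(\varepsilon)}(\rho)$ is well defined for $\rho \le \delta$ and sufficiently small $\varepsilon$.

The following lemma gives an asymptotic expansion for ${}_j\mathbf{U}^{(\varepsilon)}(\rho)$.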




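Lemma 4 Assume that conditions A-C and $\mathbf{P_k}$ hold. Then we have the following asymptotic expansion,

$${}_j\mathbf{U}^{(\varepsilon)}(\rho) = {}_j\mathbf{U}[\rho, 0] + {}_j\mathbf{U}[\rho, 1]\varepsilon + \cdots + {}_j\mathbf{U}[\rho, k]\varepsilon^k + \mathbf{o}(\varepsilon^k), \tag{59}$$

where

$${}_j\mathbf{U}[\rho, n] = \begin{cases} (\mathbf{I} - {}_j\mathbf{P}^{(0)}(\rho))^{-1} & n = 0, \\ {}_j\mathbf{U}[\rho, 0] \sum_{q=1}^{n} {}_j\mathbf{P}[\rho, 0, q]\, {}_j\mathbf{U}[\rho, n - q] & n = 1, \ldots, k. \end{cases} \tag{60}$$

Proof As already mentioned above, conditions A-C ensure that the inverse ${}_j\mathbf{U}^{(\varepsilon)}(\rho)$ exists for sufficiently small $\varepsilon$. In this case, it is known that the expansion (59) exists under condition $\mathbf{P_k}$. To see that the coefficients are given by (60), first note that

$$\begin{aligned}
\mathbf{I} &= (\mathbf{I} - {}_j\mathbf{P}^{(\varepsilon)}(\rho))\, {}_j\mathbf{U}^{(\varepsilon)}(\rho) \\
&= \big(\mathbf{I} - {}_j\mathbf{P}^{(0)}(\rho) - {}_j\mathbf{P}[\rho, 0, 1]\varepsilon - \cdots - {}_j\mathbf{P}[\rho, 0, k]\varepsilon^k + \mathbf{o}(\varepsilon^k)\big) \\
&\quad \times \big({}_j\mathbf{U}[\rho, 0] + {}_j\mathbf{U}[\rho, 1]\varepsilon + \cdots + {}_j\mathbf{U}[\rho, k]\varepsilon^k + \mathbf{o}(\varepsilon^k)\big).
\end{aligned} \tag{61}$$

By first expanding both sides of Eq. (61) and then, for $n = 0, 1, \ldots, k$, equating coefficients of $\varepsilon^n$ in the left and right hand sides, we get formula (60).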
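The recursion (60) is easy to try out numerically. The following sketch builds the coefficients for a hypothetical $2 \times 2$ polynomial perturbation (the matrices in `P_COEF` are invented for illustration and are not taken from the paper) and compares the truncated expansion (59) with the exact inverse at a small value of $\varepsilon$. All helper names are assumptions of this sketch.

```python
# Hypothetical coefficients of jP(eps) = P0 + P1*eps + P2*eps**2 with N = 2.
# The first column is zero, as it is for j = 1 in definition (7).
P_COEF = [
    [[0.0, 0.4], [0.0, 0.5]],
    [[0.0, 0.1], [0.0, -0.2]],
    [[0.0, 0.05], [0.0, 0.1]],
]
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
ZERO = [[0.0, 0.0], [0.0, 0.0]]


def mat_mul(a, b):
    return [[sum(a[i][l] * b[l][j] for l in range(2)) for j in range(2)]
            for i in range(2)]


def mat_add(a, b, scale=1.0):
    """Return a + scale * b as a new matrix."""
    return [[a[i][j] + scale * b[i][j] for j in range(2)] for i in range(2)]


def inv2(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def u_coefficients(k):
    """Coefficients U[0], ..., U[k] from recursion (60)."""
    u = [inv2(mat_add(IDENTITY, P_COEF[0], -1.0))]
    for n in range(1, k + 1):
        s = ZERO
        for q in range(1, n + 1):
            if q < len(P_COEF):          # coefficients beyond eps**2 vanish
                s = mat_add(s, mat_mul(P_COEF[q], u[n - q]))
        u.append(mat_mul(u[0], s))
    return u


def exact_inverse(eps):
    p = ZERO
    for q, coef in enumerate(P_COEF):
        p = mat_add(p, coef, eps ** q)
    return inv2(mat_add(IDENTITY, p, -1.0))


def truncation_error(eps, k):
    """Largest entrywise gap between the exact inverse and expansion (59)."""
    approx = ZERO
    for n, coef in enumerate(u_coefficients(k)):
        approx = mat_add(approx, coef, eps ** n)
    exact = exact_inverse(eps)
    return max(abs(exact[i][j] - approx[i][j])
               for i in range(2) for j in range(2))


print(truncation_error(1e-2, 1), truncation_error(1e-2, 3))
```

If the recursion is implemented correctly, raising the order `k` should shrink the error by roughly a factor of $\varepsilon$ per added term.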


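We are now ready to construct asymptotic expansions for $\boldsymbol{\Phi}_j^{(\varepsilon)}(\rho, r)$.

Theorem 1 Assume that conditions A-C and $\mathbf{P_k}$ hold. Then:

(i) We have the following asymptotic expansion,

$$\boldsymbol{\Phi}_j^{(\varepsilon)}(\rho) = \boldsymbol{\Phi}_j[\rho, 0, 0] + \boldsymbol{\Phi}_j[\rho, 0, 1]\varepsilon + \cdots + \boldsymbol{\Phi}_j[\rho, 0, k]\varepsilon^k + \mathbf{o}(\varepsilon^k),$$

where

$$\boldsymbol{\Phi}_j[\rho, 0, n] = \begin{cases} \boldsymbol{\Phi}_j^{(0)}(\rho) & n = 0, \\ \sum_{q=0}^{n} {}_j\mathbf{U}[\rho, q]\, \mathbf{p}_j[\rho, 0, n - q] & n = 1, \ldots, k. \end{cases}$$

(ii) For $r = 1, \ldots, k$, we have the following asymptotic expansions,

$$\boldsymbol{\Phi}_j^{(\varepsilon)}(\rho, r) = \boldsymbol{\Phi}_j[\rho, r, 0] + \boldsymbol{\Phi}_j[\rho, r, 1]\varepsilon + \cdots + \boldsymbol{\Phi}_j[\rho, r, k - r]\varepsilon^{k-r} + \mathbf{o}(\varepsilon^{k-r}),$$

where

$$\boldsymbol{\Phi}_j[\rho, r, n] = \begin{cases} \boldsymbol{\Phi}_j^{(0)}(\rho, r) & n = 0, \\ \sum_{q=0}^{n} {}_j\mathbf{U}[\rho, q]\, \boldsymbol{\lambda}_j[\rho, r, n - q] & n = 1, \ldots, k - r, \end{cases}$$

and, for $t = 0, \ldots, k - r$,

$$\boldsymbol{\lambda}_j[\rho, r, t] = \mathbf{p}_j[\rho, r, t] + \sum_{m=1}^{r} \binom{r}{m} \sum_{q=0}^{t} {}_j\mathbf{P}[\rho, m, q]\, \boldsymbol{\Phi}_j[\rho, r - m, t - q].$$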


Before proceeding with the proof of Theorem 1 we would like to comment on the reason that the theorem is stated in such a way that $\boldsymbol{\Phi}_j^{(\varepsilon)}(\rho, r)$, for $r = 1, \ldots, k$, has an expansion of order $k - r$. The reason is that this is exactly what we need for the main result in [6], which, we remind the reader, is a sequel to the present paper. However, it is possible to construct asymptotic expansions of different orders than the ones stated in the theorem. In that case, appropriate changes in the perturbation condition should be made. The same remark applies to Lemma 5 and Theorem 2.


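Proof Under conditions A-C, we have, for sufficiently small $\varepsilon$, that the recursive systems of linear equations given by relations (54)-(56) all have finite components. Moreover, the inverse matrix ${}_j\mathbf{U}^{(\varepsilon)}(\rho) = (\mathbf{I} - {}_j\mathbf{P}^{(\varepsilon)}(\rho))^{-1}$ exists, so these systems have unique solutions.

It follows from (54), Lemma 4, and condition $\mathbf{P_k}$ that

$$\begin{aligned}
\boldsymbol{\Phi}_j^{(\varepsilon)}(\rho) &= {}_j\mathbf{U}^{(\varepsilon)}(\rho)\, \mathbf{p}_j^{(\varepsilon)}(\rho) \\
&= \big({}_j\mathbf{U}[\rho, 0] + {}_j\mathbf{U}[\rho, 1]\varepsilon + \cdots + {}_j\mathbf{U}[\rho, k]\varepsilon^k + \mathbf{o}(\varepsilon^k)\big) \\
&\quad \times \big(\mathbf{p}_j[\rho, 0, 0] + \mathbf{p}_j[\rho, 0, 1]\varepsilon + \cdots + \mathbf{p}_j[\rho, 0, k]\varepsilon^k + \mathbf{o}(\varepsilon^k)\big).
\end{aligned} \tag{62}$$

By expanding the right hand side of Eq. (62), we see that part (i) of Theorem 1 holds.

With $r = 1$, relation (56) takes the form

$$\boldsymbol{\lambda}_j^{(\varepsilon)}(\rho, 1) = \mathbf{p}_j^{(\varepsilon)}(\rho, 1) + {}_j\mathbf{P}^{(\varepsilon)}(\rho, 1)\, \boldsymbol{\Phi}_j^{(\varepsilon)}(\rho). \tag{63}$$

From (63), condition $\mathbf{P_k}$, and part (i), we get

$$\begin{aligned}
\boldsymbol{\lambda}_j^{(\varepsilon)}(\rho, 1) &= \mathbf{p}_j[\rho, 1, 0] + \cdots + \mathbf{p}_j[\rho, 1, k - 1]\varepsilon^{k-1} + \mathbf{o}(\varepsilon^{k-1}) \\
&\quad + \big({}_j\mathbf{P}[\rho, 1, 0] + \cdots + {}_j\mathbf{P}[\rho, 1, k - 1]\varepsilon^{k-1} + \mathbf{o}(\varepsilon^{k-1})\big) \\
&\quad \times \big(\boldsymbol{\Phi}_j[\rho, 0, 0] + \cdots + \boldsymbol{\Phi}_j[\rho, 0, k - 1]\varepsilon^{k-1} + \mathbf{o}(\varepsilon^{k-1})\big).
\end{aligned} \tag{64}$$

Expanding the right hand side of (64) gives

$$\boldsymbol{\lambda}_j^{(\varepsilon)}(\rho, 1) = \boldsymbol{\lambda}_j[\rho, 1, 0] + \boldsymbol{\lambda}_j[\rho, 1, 1]\varepsilon + \cdots + \boldsymbol{\lambda}_j[\rho, 1, k - 1]\varepsilon^{k-1} + \mathbf{o}(\varepsilon^{k-1}), \tag{65}$$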
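where

$$\boldsymbol{\lambda}_j[\rho, 1, t] = \mathbf{p}_j[\rho, 1, t] + \sum_{q=0}^{t} {}_j\mathbf{P}[\rho, 1, q]\, \boldsymbol{\Phi}_j[\rho, 0, t - q], \quad t = 0, \ldots, k - 1.$$

It now follows from (55), (65), and Lemma 4 that

$$\begin{aligned}
\boldsymbol{\Phi}_j^{(\varepsilon)}(\rho, 1) &= {}_j\mathbf{U}^{(\varepsilon)}(\rho)\, \boldsymbol{\lambda}_j^{(\varepsilon)}(\rho, 1) \\
&= \big({}_j\mathbf{U}[\rho, 0] + \cdots + {}_j\mathbf{U}[\rho, k - 1]\varepsilon^{k-1} + \mathbf{o}(\varepsilon^{k-1})\big) \\
&\quad \times \big(\boldsymbol{\lambda}_j[\rho, 1, 0] + \cdots + \boldsymbol{\lambda}_j[\rho, 1, k - 1]\varepsilon^{k-1} + \mathbf{o}(\varepsilon^{k-1})\big).
\end{aligned} \tag{66}$$

By expanding the right hand side of Eq. (66) we get the expansion in part (ii) for $r = 1$. If $k = 1$, this concludes the proof. If $k \ge 2$, we can repeat the steps above, successively, for $r = 2, \ldots, k$. This gives the expansions and formulas given in part (ii).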


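Let us now define the following mixed power-exponential moment functionals, for $i, j, s \in X$,

$$\omega_{ijs}^{(\varepsilon)}(\rho, r) = \sum_{n=0}^{\infty} n^r e^{\rho n} \mathsf{P}_i\{\xi^{(\varepsilon)}(n) = s,\ \mu_0^{(\varepsilon)} \wedge \mu_j^{(\varepsilon)} > n\}, \quad \rho \in \mathbb{R},\ r = 0, 1, \ldots.$$

Notice that $\omega_{ijs}^{(\varepsilon)}(\rho, 0) = \omega_{ijs}^{(\varepsilon)}(\rho)$.

It follows from conditions A-C and Lemma 3 that for $\rho < \delta$ and sufficiently small $\varepsilon$, the functions $\omega_{ijs}^{(\varepsilon)}(\rho)$ and $p_{ij}^{(\varepsilon)}(\rho)$ are arbitrarily many times differentiable with respect to $\rho$, and the derivatives of order $r$ are given by $\omega_{ijs}^{(\varepsilon)}(\rho, r)$ and $p_{ij}^{(\varepsilon)}(\rho, r)$, respectively. Under these conditions we also have that the functions $\varphi_i^{(\varepsilon)}(\rho)$, defined by Eq. (11), are differentiable. Let us denote the corresponding derivatives by $\varphi_i^{(\varepsilon)}(\rho, r)$.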
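Recall from Sect. 3 that the functions $\omega_{ijs}^{(\varepsilon)}(\rho)$ satisfy the following system of linear equations,

$$\omega_{ijs}^{(\varepsilon)}(\rho) = \delta(i, s) \varphi_i^{(\varepsilon)}(\rho) + \sum_{l \ne 0, j} p_{il}^{(\varepsilon)}(\rho) \omega_{ljs}^{(\varepsilon)}(\rho), \quad i, j, s \ne 0. \tag{67}$$

Differentiating relation (67) gives

$$\omega_{ijs}^{(\varepsilon)}(\rho, r) = \theta_{ijs}^{(\varepsilon)}(\rho, r) + \sum_{l \ne 0, j} p_{il}^{(\varepsilon)}(\rho) \omega_{ljs}^{(\varepsilon)}(\rho, r), \quad r = 1, 2, \ldots,\ i, j, s \ne 0, \tag{68}$$

where

$$\theta_{ijs}^{(\varepsilon)}(\rho, r) = \delta(i, s) \varphi_i^{(\varepsilon)}(\rho, r) + \sum_{m=1}^{r} \binom{r}{m} \sum_{l \ne 0, j} p_{il}^{(\varepsilon)}(\rho, m) \omega_{ljs}^{(\varepsilon)}(\rho, r - m). \tag{69}$$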


In order to rewrite these systems in matrix form, we define the following column 
vectors, 




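$$\boldsymbol{\omega}_{js}^{(\varepsilon)}(\rho, r) = \big[\omega_{1js}^{(\varepsilon)}(\rho, r) \ \cdots \ \omega_{Njs}^{(\varepsilon)}(\rho, r)\big]^T, \quad j, s \ne 0, \tag{70}$$

$$\boldsymbol{\theta}_{js}^{(\varepsilon)}(\rho, r) = \big[\theta_{1js}^{(\varepsilon)}(\rho, r) \ \cdots \ \theta_{Njs}^{(\varepsilon)}(\rho, r)\big]^T, \quad j, s \ne 0, \tag{71}$$

$$\widehat{\boldsymbol{\varphi}}_s^{(\varepsilon)}(\rho, r) = \big[\delta(1, s) \varphi_1^{(\varepsilon)}(\rho, r) \ \cdots \ \delta(N, s) \varphi_N^{(\varepsilon)}(\rho, r)\big]^T, \quad s \ne 0. \tag{72}$$

Using (53) and (67)-(72), we can for each $j, s \ne 0$ write the following recursive systems of linear equations,

$$\boldsymbol{\omega}_{js}^{(\varepsilon)}(\rho) = \widehat{\boldsymbol{\varphi}}_s^{(\varepsilon)}(\rho) + {}_j\mathbf{P}^{(\varepsilon)}(\rho)\, \boldsymbol{\omega}_{js}^{(\varepsilon)}(\rho), \tag{73}$$

and, for $r = 1, 2, \ldots$,

$$\boldsymbol{\omega}_{js}^{(\varepsilon)}(\rho, r) = \boldsymbol{\theta}_{js}^{(\varepsilon)}(\rho, r) + {}_j\mathbf{P}^{(\varepsilon)}(\rho)\, \boldsymbol{\omega}_{js}^{(\varepsilon)}(\rho, r), \tag{74}$$

where

$$\boldsymbol{\theta}_{js}^{(\varepsilon)}(\rho, r) = \widehat{\boldsymbol{\varphi}}_s^{(\varepsilon)}(\rho, r) + \sum_{m=1}^{r} \binom{r}{m} {}_j\mathbf{P}^{(\varepsilon)}(\rho, m)\, \boldsymbol{\omega}_{js}^{(\varepsilon)}(\rho, r - m). \tag{75}$$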


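In order to construct asymptotic expansions for the vectors $\boldsymbol{\omega}_{js}^{(\varepsilon)}(\rho, r)$, we can use the same technique as in Theorem 1. However, a preliminary step needed in this case is to construct asymptotic expansions for the functions $\varphi_i^{(\varepsilon)}(\rho, r)$. In order to do this, we first derive an expression for these functions.

Let us define

$$\psi_i^{(\varepsilon)}(\rho, r) = \sum_{n=0}^{\infty} n^r e^{\rho n} \mathsf{P}_i\{\kappa_1^{(\varepsilon)} = n\}, \quad \rho \in \mathbb{R},\ r = 0, 1, \ldots,\ i \in X. \tag{76}$$

Note that

$$\psi_i^{(\varepsilon)}(\rho, r) = \sum_{j \in X} p_{ij}^{(\varepsilon)}(\rho, r), \quad \rho \in \mathbb{R},\ r = 0, 1, \ldots,\ i \in X. \tag{77}$$

Thus, the functions $\psi_i^{(\varepsilon)}(\rho, 0)$ are arbitrarily many times differentiable with respect to $\rho$ and the corresponding derivatives are given by $\psi_i^{(\varepsilon)}(\rho, r)$.

The function $\varphi_i^{(\varepsilon)}(\rho)$, defined by Eq. (11), can be written as

$$\varphi_i^{(\varepsilon)}(\rho) = \begin{cases} \psi_i^{(\varepsilon)}(0, 1) & \rho = 0, \\ (\psi_i^{(\varepsilon)}(\rho, 0) - 1)/(e^{\rho} - 1) & \rho \ne 0. \end{cases} \tag{78}$$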


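From (76) and (78) it follows that

$$\psi_i^{(\varepsilon)}(\rho, 0) = (e^{\rho} - 1) \varphi_i^{(\varepsilon)}(\rho) + 1, \quad \rho \in \mathbb{R}. \tag{79}$$

Differentiating both sides of (79) gives

$$\psi_i^{(\varepsilon)}(\rho, r) = (e^{\rho} - 1) \varphi_i^{(\varepsilon)}(\rho, r) + e^{\rho} \sum_{m=0}^{r-1} \binom{r}{m} \varphi_i^{(\varepsilon)}(\rho, m), \quad r = 1, 2, \ldots. \tag{80}$$

If $\rho = 0$, Eq. (80) implies

$$\psi_i^{(\varepsilon)}(0, r) = r \varphi_i^{(\varepsilon)}(0, r - 1) + \sum_{m=0}^{r-2} \binom{r}{m} \varphi_i^{(\varepsilon)}(0, m), \quad r = 2, 3, \ldots.$$

From this it follows that, for $r = 1, 2, \ldots$,

$$\varphi_i^{(\varepsilon)}(0, r) = \frac{1}{r + 1} \Big(\psi_i^{(\varepsilon)}(0, r + 1) - \sum_{m=0}^{r-1} \binom{r+1}{m} \varphi_i^{(\varepsilon)}(0, m)\Big). \tag{81}$$

If $\rho \ne 0$, Eq. (80) gives, for $r = 1, 2, \ldots$,

$$\varphi_i^{(\varepsilon)}(\rho, r) = \frac{1}{e^{\rho} - 1} \Big(\psi_i^{(\varepsilon)}(\rho, r) - e^{\rho} \sum_{m=0}^{r-1} \binom{r}{m} \varphi_i^{(\varepsilon)}(\rho, m)\Big). \tag{82}$$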


Using relations (77), (81), and (82), we can recursively calculate the derivatives of $\varphi_i^{(\varepsilon)}(\rho)$. Furthermore, it follows directly from these formulas that we can construct asymptotic expansions for these derivatives. The formulas are given in the following lemma.
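The recursions (81) and (82) can be checked against a direct computation. By the calculation leading to (10), the function in (11) equals $\mathsf{E}_i \sum_{n=0}^{\kappa_1 - 1} e^{\rho n}$, so its derivative of order $r$ is $\mathsf{E}_i \sum_{n=0}^{\kappa_1 - 1} n^r e^{\rho n}$. The sketch below compares both routes for a hypothetical distribution of the first transition time; the distribution `DIST` and all function names are assumptions of this sketch, not part of the paper.

```python
import math

# Hypothetical distribution of kappa_1 for a fixed initial state:
# P{kappa_1 = n} for n in the keys.
DIST = {1: 0.5, 2: 0.3, 4: 0.2}


def psi(rho, r):
    """psi(rho, r) = sum_n n^r e^{rho n} P{kappa_1 = n}, as in (76)."""
    return sum(n ** r * math.exp(rho * n) * p for n, p in DIST.items())


def phi_direct(rho, r):
    """r-th derivative of E sum_{m < kappa_1} e^{rho m}, computed term by term."""
    return sum(p * sum(m ** r * math.exp(rho * m) for m in range(n))
               for n, p in DIST.items())


def phi_recursive(rho, r_max):
    """Derivatives of orders 0..r_max via (78) and the recursions (81), (82)."""
    values = []
    for r in range(r_max + 1):
        if rho == 0.0:
            # (81); for r = 0 the sum is empty and this reduces to psi(0, 1).
            tail = sum(math.comb(r + 1, m) * values[m] for m in range(r))
            values.append((psi(0.0, r + 1) - tail) / (r + 1))
        else:
            z = math.exp(rho)
            # (82); for r = 0 this is the second case of (78).
            tail = sum(math.comb(r, m) * values[m] for m in range(r))
            head = psi(rho, r) - (1.0 if r == 0 else 0.0)
            values.append((head - z * tail) / (z - 1.0))
    return values


for rho in (0.0, 0.3):
    print(rho, phi_recursive(rho, 3), [phi_direct(rho, r) for r in range(4)])
```

Note that the case $\rho = 0$ needs `psi` one order higher than the derivative being computed, which is why the corresponding expansions require a perturbation condition of one order higher.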


Lemma 5 Assume that conditions A-C hold.

(i) If, in addition, condition $\mathbf{P_k}$ holds, then for each $i \ne 0$ and $r = 0, \ldots, k$ we have the following asymptotic expansion,

$$\psi_i^{(\varepsilon)}(\rho, r) = \psi_i[\rho, r, 0] + \psi_i[\rho, r, 1]\varepsilon + \cdots + \psi_i[\rho, r, k - r]\varepsilon^{k-r} + o(\varepsilon^{k-r}),$$

where

$$\psi_i[\rho, r, n] = \sum_{j \in X} p_{ij}[\rho, r, n], \quad n = 0, \ldots, k - r.$$

(ii) If, in addition, $\rho = 0$ and condition $\mathbf{P_{k+1}}$ holds, then for each $i \ne 0$ and $r = 0, \ldots, k$ we have the following asymptotic expansion,

$$\varphi_i^{(\varepsilon)}(0, r) = \varphi_i[0, r, 0] + \varphi_i[0, r, 1]\varepsilon + \cdots + \varphi_i[0, r, k - r]\varepsilon^{k-r} + o(\varepsilon^{k-r}),$$

where, for $n = 0, \ldots, k - r$,

$$\varphi_i[0, r, n] = \frac{1}{r + 1} \Big(\psi_i[0, r + 1, n] - \sum_{m=0}^{r-1} \binom{r+1}{m} \varphi_i[0, m, n]\Big).$$

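(iii) If, in addition, $\rho \ne 0$ and condition $\mathbf{P_k}$ holds, then for each $i \ne 0$ and $r = 0, \ldots, k$ we have the following asymptotic expansion,

$$\varphi_i^{(\varepsilon)}(\rho, r) = \varphi_i[\rho, r, 0] + \varphi_i[\rho, r, 1]\varepsilon + \cdots + \varphi_i[\rho, r, k - r]\varepsilon^{k-r} + o(\varepsilon^{k-r}),$$

where, for $n = 0, \ldots, k - r$,

$$\varphi_i[\rho, r, n] = \frac{1}{e^{\rho} - 1} \Big(\psi_i[\rho, r, n] - \delta(r, 0)\delta(n, 0) - e^{\rho} \sum_{m=0}^{r-1} \binom{r}{m} \varphi_i[\rho, m, n]\Big).$$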

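Using (72) and Lemma 5 we can now construct the following asymptotic expansions, for $r = 0, \ldots, k$, and $s \ne 0$,

$$\widehat{\boldsymbol{\varphi}}_s^{(\varepsilon)}(\rho, r) = \widehat{\boldsymbol{\varphi}}_s[\rho, r, 0] + \widehat{\boldsymbol{\varphi}}_s[\rho, r, 1]\varepsilon + \cdots + \widehat{\boldsymbol{\varphi}}_s[\rho, r, k - r]\varepsilon^{k-r} + \mathbf{o}(\varepsilon^{k-r}). \tag{83}$$

The next theorem gives asymptotic expansions for $\boldsymbol{\omega}_{js}^{(\varepsilon)}(\rho, r)$.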


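Theorem 2 Assume that conditions A-C hold. If $\rho = 0$, we also assume that condition $\mathbf{P_{k+1}}$ holds. If $\rho \ne 0$, we also assume that condition $\mathbf{P_k}$ holds. Then:

(i) We have the following asymptotic expansion,

$$\boldsymbol{\omega}_{js}^{(\varepsilon)}(\rho) = \boldsymbol{\omega}_{js}[\rho, 0, 0] + \boldsymbol{\omega}_{js}[\rho, 0, 1]\varepsilon + \cdots + \boldsymbol{\omega}_{js}[\rho, 0, k]\varepsilon^k + \mathbf{o}(\varepsilon^k),$$

where

$$\boldsymbol{\omega}_{js}[\rho, 0, n] = \begin{cases} \boldsymbol{\omega}_{js}^{(0)}(\rho) & n = 0, \\ \sum_{q=0}^{n} {}_j\mathbf{U}[\rho, q]\, \widehat{\boldsymbol{\varphi}}_s[\rho, 0, n - q] & n = 1, \ldots, k. \end{cases}$$

(ii) For $r = 1, \ldots, k$, we have the following asymptotic expansions,

$$\boldsymbol{\omega}_{js}^{(\varepsilon)}(\rho, r) = \boldsymbol{\omega}_{js}[\rho, r, 0] + \boldsymbol{\omega}_{js}[\rho, r, 1]\varepsilon + \cdots + \boldsymbol{\omega}_{js}[\rho, r, k - r]\varepsilon^{k-r} + \mathbf{o}(\varepsilon^{k-r}),$$

where

$$\boldsymbol{\omega}_{js}[\rho, r, n] = \begin{cases} \boldsymbol{\omega}_{js}^{(0)}(\rho, r) & n = 0, \\ \sum_{q=0}^{n} {}_j\mathbf{U}[\rho, q]\, \boldsymbol{\theta}_{js}[\rho, r, n - q] & n = 1, \ldots, k - r, \end{cases}$$

and, for $t = 0, \ldots, k - r$,

$$\boldsymbol{\theta}_{js}[\rho, r, t] = \widehat{\boldsymbol{\varphi}}_s[\rho, r, t] + \sum_{m=1}^{r} \binom{r}{m} \sum_{q=0}^{t} {}_j\mathbf{P}[\rho, m, q]\, \boldsymbol{\omega}_{js}[\rho, r - m, t - q].$$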
Proof Under conditions A-C, we have, for sufficiently small $\varepsilon$, that the recursive systems of linear equations given by relations (73)-(75) all have finite components. Moreover, the inverse matrix ${}_j\mathbf{U}^{(\varepsilon)}(\rho) = (\mathbf{I} - {}_j\mathbf{P}^{(\varepsilon)}(\rho))^{-1}$ exists, so these systems have unique solutions. Since we, by Lemma 5, have the expansions given in Eq. (83), the proof is from this point analogous to the proof of Theorem 1.
Asymptotics for Quasi-stationary
Distributions of Perturbed Discrete Time 
Semi-Markov Processes 


Mikael Petersson 


Abstract In this paper we study quasi-stationary distributions of non-linearly perturbed semi-Markov processes in discrete time. Distributions of this type are of interest for the analysis of stochastic systems which have finite lifetimes but are expected to persist for a long time. We obtain asymptotic power series expansions for quasi-
stationary distributions and it is shown how the coefficients in these expansions can 
be computed from a recursive algorithm. As an illustration of this algorithm, we 
present a numerical example for a discrete time Markov chain. 


Keywords Semi-Markov process - Perturbation - Quasi-stationary distribution - 
Asymptotic expansion - Renewal equation - Markov chain 


1 Introduction 


This paper is a sequel of [22] where recursive algorithms for computing asymp- 
totic expansions of moment functionals for non-linearly perturbed semi-Markov 
processes in discrete time are presented. Here, these expansions play a fundamen- 
tal role for constructing asymptotic expansions of quasi-stationary distributions for 
such processes. Let us remark that all notation, conditions, and key results which we 
need here are repeated. However, some extensive formulas needed for computation 
of coefficients in certain asymptotic expansions are not repeated. Thus, the present 
paper is essentially self-contained. 

Quasi-stationary distributions are useful for studies of stochastic systems with 
random lifetimes. Usually, for such systems, the evolution of some quantity of interest 
is described by some stochastic process and the lifetime of the system is the first time 
this process hits some absorbing subset of the state space. For such processes, the 
stationary distribution will be concentrated on this absorbing subset. However, if we 
expect that the system will persist for a long time, the stationary distribution may
not be an appropriate measure for describing the long time behaviour of the process.
Instead, it might be more relevant to consider so-called quasi-stationary distributions. 
This type of distributions is obtained by taking limits of transition probabilities which 
are conditioned on the event that the process has not yet been absorbed. 

Models of the type described above arise in many areas of applications such 
as epidemics, genetics, population dynamics, queuing theory, reliability, and risk 
theory. For example, in population dynamics models the number of individuals may 
be modelled by some stochastic process and we can consider the extinction time 
of the population as the lifetime. In epidemic models, the process may describe the 
evolution of the number of infected individuals and we can regard the end of the 
epidemic as the lifetime. 

We consider, for every $\varepsilon \ge 0$, a discrete time semi-Markov process $\xi^{(\varepsilon)}(n)$, $n = 0, 1, \ldots$, on a finite state space $X = \{0, 1, \ldots, N\}$. It is assumed that the process $\xi^{(\varepsilon)}(n)$ depends on $\varepsilon$ in such a way that its transition probabilities are functions of $\varepsilon$ which converge pointwise to the transition probabilities for the limiting process $\xi^{(0)}(n)$. Thus, we can interpret $\xi^{(\varepsilon)}(n)$, for $\varepsilon > 0$, as a perturbation of $\xi^{(0)}(n)$. Furthermore, it is assumed that the states $\{1, \ldots, N\}$ form a communicating class for $\varepsilon$ small enough.

Under the conditions mentioned above, some additional assumptions of finite exponential moments of distributions of transition times, and a condition which guarantees that the limiting semi-Markov process is non-periodic, a unique quasi-stationary distribution, independent of the initial state, can be defined for each sufficiently small $\varepsilon$ by the following relation,

$$\pi_j^{(\varepsilon)} = \lim_{n \to \infty} \mathsf{P}_i\{\xi^{(\varepsilon)}(n) = j \mid \mu_0^{(\varepsilon)} > n\}, \quad i, j \ne 0,$$

where $\mu_0^{(\varepsilon)}$ is the first hitting time of state 0.
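To make the defining limit concrete, the following sketch evaluates the conditional distribution for a hypothetical discrete time Markov chain on $\{0, 1, 2\}$ with absorbing state 0, started from each of the two transient states. The matrix `Q` and the function name are assumptions of this illustration only. For such a chain the limit is expected to be the normalized left eigenvector of `Q` for its largest eigenvalue, which for this particular matrix works out to $(2\sqrt{3} - 3,\ 4 - 2\sqrt{3})$.

```python
import math

# Hypothetical one-step transition probabilities between the transient
# states 1 and 2; the missing row mass is absorbed in state 0.
Q = [[0.5, 0.4],
     [0.3, 0.5]]


def conditional_distribution(start, n):
    """P_start{xi(n) = j | not yet absorbed at time n}, for j = 1, 2."""
    row = [1.0 if k == start else 0.0 for k in range(2)]
    for _ in range(n):
        row = [sum(row[l] * Q[l][k] for l in range(2)) for k in range(2)]
        # Renormalising at every step only rescales the vector, so the
        # final conditional distribution is unchanged.
        total = sum(row)
        row = [x / total for x in row]
    return row


from_state_1 = conditional_distribution(0, 60)
from_state_2 = conditional_distribution(1, 60)
print(from_state_1, from_state_2)
```

Both starting states lead to the same vector, which is the independence of the initial state mentioned in the text.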

In the present paper, we are interested in the asymptotic behaviour of the quasi-stationary distribution as the perturbation parameter $\varepsilon$ tends to zero. Specifically, an asymptotic power series expansion of the quasi-stationary distribution is constructed.

We allow for nonlinear perturbations, i.e., the transition probabilities may be nonlinear functions of $\varepsilon$. We do, however, restrict our consideration to smooth perturbations by assuming that certain mixed power-exponential moment functionals for transition probabilities, up to some order $k$, can be expanded in asymptotic power series with respect to $\varepsilon$.

In this case, we show that the quasi-stationary distribution has the following 
asymptotic expansion, 


$$\pi_j^{(\varepsilon)} = \pi_j^{(0)} + \pi_j[1]\varepsilon + \cdots + \pi_j[k]\varepsilon^k + o(\varepsilon^k), \quad j \ne 0, \tag{1}$$

where the coefficients $\pi_j[1], \ldots, \pi_j[k]$ can be calculated from explicit recursive formulas. These formulas are functions of the coefficients in the expansions of the
moment functionals mentioned above. The existence of the expansion (1) and the algorithm for computing the coefficients in this expansion are the main results of this paper.

It is worth mentioning that the asymptotic relation given by Eq. (1) simultaneously covers three different cases. In the simplest case, there exists $\varepsilon_0 > 0$ such that transitions to state 0 are not possible for any $\varepsilon \in [0, \varepsilon_0]$. In this case, relation (1) gives asymptotic expansions for stationary distributions. Then, we have an intermediate case where transitions to state 0 are possible for all $\varepsilon \in (0, \varepsilon_0]$ but not possible for $\varepsilon = 0$. In this case we have that $\mu_0^{(\varepsilon)} \to \infty$ in probability as $\varepsilon \to 0$. In the mathematically most difficult case, we have that transitions to state 0 are possible for all $\varepsilon \in [0, \varepsilon_0]$. In this case, the random variables $\mu_0^{(\varepsilon)}$ are stochastically bounded as $\varepsilon \to 0$.

The expansion (1) is given for continuous time semi-Markov processes in [13, 14]. 
However, the discrete time case is interesting in its own right and deserves a special 
treatment. In particular, a discrete time model is often a natural choice in applications 
where measures of some quantity of interest are only available at given time points, 
for example days or months. The proof of the result for the continuous time case, 
as well as the proofs in the present paper, is based on the theory of non-linearly 
perturbed renewal equations. For results related to continuous time in this line of 
research, we refer to the comprehensive book [14], which also contains an extensive 
bibliography of work in related areas. The corresponding theory for discrete time 
renewal equations has been developed in [9, 12, 19-21, 25]. 

Quasi-stationary distributions have been studied extensively since the 1960s. For 
some of the early works on Markov chains and semi-Markov processes, see, for 
example, [4, 5, 7, 10, 16, 24, 30]. A survey of quasi-stationary distributions for 
models with discrete state spaces and more references can be found in [29]. 

Studies of asymptotic properties for first hitting times, stationary distributions, 
and other characteristics for Markov chains with linear, polynomial, and analytic 
perturbations have attracted a lot of attention, see, for example, [1-3, 6, 8, 11, 15, 
17, 18, 23, 27, 28, 31, 32]. Recently, some of the results of these papers have 
been extended to non-linearly perturbed semi-Markov processes. Using a method of 
sequential phase space reduction, asymptotic expansions for expected first hitting 
times and stationary distributions are given in [26]. This paper also contains an 
extensive bibliography. 

Let us now briefly comment on the structure of the present paper. In Sect. 2, most
of the notation we need is introduced and the main result is formulated. We apply
the discrete time renewal theorem in order to get a formula for the quasi-stationary
distribution in Sect. 3 and then the proof of the main result is presented in Sect. 4.
Finally, in Sect. 5, we illustrate the results in the special case of discrete time Markov 
chains. 


2 Main Result 


In this section we first introduce most of the notation that will be used in the present 
paper and then we formulate the main result. 
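
For each $\varepsilon \ge 0$, let $\xi^{(\varepsilon)}(n)$, $n = 0, 1, \ldots$, be a discrete time semi-Markov process
on the state space $X = \{0, 1, \ldots, N\}$, generated by the discrete time Markov renewal
process $(\eta_n^{(\varepsilon)}, \kappa_n^{(\varepsilon)})$, $n = 0, 1, \ldots$, having state space $X \times \{1, 2, \ldots\}$ and transition
probabilities

$$Q_{ij}^{(\varepsilon)}(k) = \mathsf{P}\{\eta_{n+1}^{(\varepsilon)} = j,\ \kappa_{n+1}^{(\varepsilon)} = k \mid \eta_n^{(\varepsilon)} = i,\ \kappa_n^{(\varepsilon)} = l\}, \quad k, l = 1, 2, \ldots,\ i, j \in X.$$

We can write the transition probabilities as $Q_{ij}^{(\varepsilon)}(k) = p_{ij}^{(\varepsilon)} f_{ij}^{(\varepsilon)}(k)$, where $p_{ij}^{(\varepsilon)}$ are
transition probabilities for the embedded Markov chain $\eta_n^{(\varepsilon)}$ and

$$f_{ij}^{(\varepsilon)}(k) = \mathsf{P}\{\kappa_{n+1}^{(\varepsilon)} = k \mid \eta_n^{(\varepsilon)} = i,\ \eta_{n+1}^{(\varepsilon)} = j\}, \quad k = 1, 2, \ldots,\ i, j \in X,$$

are conditional distributions of transition times.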

Let us here remark that definitions of discrete time semi-Markov processes and 
Markov renewal processes can be found in, for example, [22]. 

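For each $j \in X$, let $\nu_j^{(\varepsilon)} = \min\{n \ge 1 : \eta_n^{(\varepsilon)} = j\}$ and $\mu_j^{(\varepsilon)} = \kappa_1^{(\varepsilon)} + \cdots + \kappa_{\nu_j^{(\varepsilon)}}^{(\varepsilon)}$.
By definition, $\nu_j^{(\varepsilon)}$ and $\mu_j^{(\varepsilon)}$ are the first hitting times of state $j$ for the embedded
Markov chain and the semi-Markov process, respectively.

In what follows, we use $\mathsf{P}_i$ and $\mathsf{E}_i$ to denote probabilities and expectations
conditioned on the event $\{\eta_0^{(\varepsilon)} = i\}$.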

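Let us define

$$g_{ij}^{(\varepsilon)}(n) = \mathsf{P}_i\{\mu_j^{(\varepsilon)} = n,\ \nu_0^{(\varepsilon)} > \nu_j^{(\varepsilon)}\}, \quad n = 0, 1, \ldots,\ i, j \in X,$$

and

$$g_{ij}^{(\varepsilon)} = \mathsf{P}_i\{\nu_0^{(\varepsilon)} > \nu_j^{(\varepsilon)}\}, \quad i, j \in X.$$

The functions $g_{ij}^{(\varepsilon)}(n)$ define discrete probability distributions which may be improper,
i.e., $\sum_{n=0}^{\infty} g_{ij}^{(\varepsilon)}(n) = g_{ij}^{(\varepsilon)} \le 1$.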
Let us also define the following mixed power-exponential moment functionals, 


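$$p_{ij}^{(\varepsilon)}(\rho, r) = \sum_{n=0}^{\infty} n^r e^{\rho n} Q_{ij}^{(\varepsilon)}(n), \quad \rho \in \mathbb{R},\ r = 0, 1, \ldots,\ i, j \in X,$$

$$\varphi_{ij}^{(\varepsilon)}(\rho, r) = \sum_{n=0}^{\infty} n^r e^{\rho n} g_{ij}^{(\varepsilon)}(n), \quad \rho \in \mathbb{R},\ r = 0, 1, \ldots,\ i, j \in X,$$

$$\omega_{ijs}^{(\varepsilon)}(\rho, r) = \sum_{n=0}^{\infty} n^r e^{\rho n} h_{ijs}^{(\varepsilon)}(n), \quad \rho \in \mathbb{R},\ r = 0, 1, \ldots,\ i, j, s \in X,$$

where $h_{ijs}^{(\varepsilon)}(n) = \mathsf{P}_i\{\xi^{(\varepsilon)}(n) = s,\ \mu_0^{(\varepsilon)} \wedge \mu_j^{(\varepsilon)} > n\}$. For convenience, we denote

$$p_{ij}^{(\varepsilon)}(\rho) = p_{ij}^{(\varepsilon)}(\rho, 0), \quad \varphi_{ij}^{(\varepsilon)}(\rho) = \varphi_{ij}^{(\varepsilon)}(\rho, 0), \quad \omega_{ijs}^{(\varepsilon)}(\rho) = \omega_{ijs}^{(\varepsilon)}(\rho, 0).$$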


We now introduce the following conditions: 


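A: (a) $p_{ij}^{(\varepsilon)} \to p_{ij}^{(0)}$, as $\varepsilon \to 0$, $i \ne 0$, $j \in X$.
   (b) $f_{ij}^{(\varepsilon)}(n) \to f_{ij}^{(0)}(n)$, as $\varepsilon \to 0$, $n = 1, 2, \ldots$, $i \ne 0$, $j \in X$.

B: $g_{ij}^{(0)} > 0$, $i, j \ne 0$.

C: There exists $\beta > 0$ such that:
   (a) $\limsup_{0 \le \varepsilon \to 0} p_{ij}^{(\varepsilon)}(\beta) < \infty$, for all $i \ne 0$, $j \in X$.
   (b) $\varphi_{ii}^{(0)}(\beta_i) \in (1, \infty)$, for some $i \ne 0$ and $\beta_i \le \beta$.

D: $g_{ii}^{(0)}(n)$ is a non-periodic distribution for some $i \ne 0$.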


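Under the conditions stated above, there exist, for sufficiently small $\varepsilon$, so-called
quasi-stationary distributions, which are independent of the initial state $i \ne 0$, and
given by the relation

$$\pi_j^{(\varepsilon)} = \lim_{n \to \infty} \mathsf{P}_i\{\xi^{(\varepsilon)}(n) = j \mid \mu_0^{(\varepsilon)} > n\}, \quad j \ne 0. \tag{2}$$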


An important role for the quasi-stationary distribution is played by the following 
characteristic equation, 


$$\varphi_{ii}^{(\varepsilon)}(\rho) = 1, \tag{3}$$

where $i \ne 0$ is arbitrary.
The following lemma summarizes some important properties for the root of 
Eq. (3). A proof is given in [22]. 


Lemma 9.1 Under conditions A-C there exists, for sufficiently small $\varepsilon$, a unique
non-negative solution $\rho^{(\varepsilon)}$ of the characteristic equation (3) which is independent of
$i$. Moreover, $\rho^{(\varepsilon)} \to \rho^{(0)}$, as $\varepsilon \to 0$.
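
As a purely illustrative sketch (not part of the cited proof), the root of Eq. (3) can be located numerically in the Markov chain case, where every transition takes one time unit. The function names, the truncation level `n_max`, the bracket `upper` and the three-state example chain below are assumptions made for this illustration; the bisection presumes that the truncated sum exceeds 1 at `upper`.

```python
import math


def first_return_distribution(Q, i, n_max):
    """Return [g_ii(0), ..., g_ii(n_max)] for a Markov chain.

    Q is the sub-stochastic transition matrix restricted to the states 1..N
    (list index s stands for state s + 1); the missing row mass is the
    probability of jumping to the absorbing state 0.
    """
    m = len(Q)
    g = [0.0] * (n_max + 1)
    v = [0.0] * m      # v[s]: in state s now, no return to i and no absorption yet
    v[i] = 1.0
    for n in range(1, n_max + 1):
        w = [sum(v[s] * Q[s][t] for s in range(m)) for t in range(m)]
        g[n] = w[i]    # the first return to i happens exactly at time n
        w[i] = 0.0     # paths that have already returned are discarded
        v = w
    return g


def characteristic_root(Q, i=0, n_max=300, upper=0.3, steps=60):
    """Bisection for phi_ii(rho) = 1 on [0, upper], with the sum truncated at n_max."""
    g = first_return_distribution(Q, i, n_max)

    def phi(rho):
        return sum(math.exp(rho * n) * g_n for n, g_n in enumerate(g))

    lo, hi = 0.0, upper
    for _ in range(steps):
        mid = (lo + hi) / 2
        if phi(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def example_chain(eps):
    """Hypothetical perturbed chain on states 1, 2, 3 with killing rates growing in eps."""
    a = math.exp(-eps)
    b = 0.5 * math.exp(-2 * eps)
    return [[0.0, a, 0.0], [0.0, 0.0, a], [b, b, 0.0]]
```

Calling `characteristic_root(example_chain(0.1), i=0)` and `characteristic_root(example_chain(0.1), i=1)` should give the same value up to rounding, in line with the statement that the root does not depend on $i$.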


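In order to construct an asymptotic expansion for the quasi-stationary distribution,
we need a perturbation condition for the transition probabilities $Q_{ij}^{(\varepsilon)}(k)$ which is
stronger than A. This condition is formulated in terms of the moment functionals
$p_{ij}^{(\varepsilon)}(\rho^{(0)}, r)$.

$\mathbf{P}_k$: $p_{ij}^{(\varepsilon)}(\rho^{(0)}, r) = p_{ij}^{(0)}(\rho^{(0)}, r) + p_{ij}[\rho^{(0)}, r, 1]\varepsilon + \cdots + p_{ij}[\rho^{(0)}, r, k - r]\varepsilon^{k-r} + o(\varepsilon^{k-r})$,
for $r = 0, \ldots, k$, $i \ne 0$, $j \in X$, where $|p_{ij}[\rho^{(0)}, r, n]| < \infty$, for $r = 0, \ldots, k$,
$n = 1, \ldots, k - r$, $i \ne 0$, $j \in X$.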

The following theorem is the main result of this paper. The proof is given in 
Sect. 4. 

Theorem 9.1 If conditions A-D and $\mathbf{P}_{k+1}$ hold, then we have the following asymptotic
expansion,

$$\pi_j^{(\varepsilon)} = \pi_j^{(0)} + \pi_j[1]\varepsilon + \cdots + \pi_j[k]\varepsilon^k + o(\varepsilon^k), \quad j \ne 0,$$

where $\pi_j[n]$, $n = 1, \ldots, k$, $j \ne 0$, can be calculated from a recursive algorithm
which is described in Sect. 4.


3 Quasi-stationary Distributions 


In this section we use renewal theory in order to get a formula for the quasi-stationary 
distribution. 

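The probabilities $P_{ij}^{(\varepsilon)}(n) = \mathsf{P}_i\{\xi^{(\varepsilon)}(n) = j,\ \mu_0^{(\varepsilon)} > n\}$, $i, j \ne 0$, satisfy the
following discrete time renewal equation,

$$P_{ij}^{(\varepsilon)}(n) = h_{ij}^{(\varepsilon)}(n) + \sum_{k=0}^{n} P_{ij}^{(\varepsilon)}(n - k)\, g_{ii}^{(\varepsilon)}(k), \quad n = 0, 1, \ldots, \tag{4}$$

where

$$h_{ij}^{(\varepsilon)}(n) = \mathsf{P}_i\{\xi^{(\varepsilon)}(n) = j,\ \mu_0^{(\varepsilon)} \wedge \mu_i^{(\varepsilon)} > n\}.$$

Since $\sum_{n=0}^{\infty} g_{ii}^{(\varepsilon)}(n) = g_{ii}^{(\varepsilon)} \le 1$, relation (4) defines a possibly improper renewal
equation.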
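
The renewal equation (4) can be checked numerically in the Markov chain case. The sketch below is an illustration only: it computes $g_{ii}(n)$ and $h_{ij}(n)$ by a taboo recursion, solves (4) forward in $n$, and compares the result with the $(i, j)$ entry of the $n$-th power of the sub-stochastic matrix. The matrix `Q_EXAMPLE` and all function names are assumptions introduced for this example.

```python
import math


def taboo_sequences(Q, i, j, n_max):
    """Return (g, h) with g[n] = g_ii(n) and h[n] = h_ij(n), n = 0..n_max.

    Q is the transition matrix restricted to the non-absorbing states;
    every transition takes one time unit (Markov chain case).
    """
    m = len(Q)
    g = [0.0] * (n_max + 1)
    h = [0.0] * (n_max + 1)
    v = [0.0] * m
    v[i] = 1.0
    h[0] = v[j]            # at time 0 the process sits in state i
    for n in range(1, n_max + 1):
        w = [sum(v[s] * Q[s][t] for s in range(m)) for t in range(m)]
        g[n] = w[i]        # first return to i at time n
        w[i] = 0.0         # keep only paths that have not yet returned to i
        h[n] = w[j]
        v = w
    return g, h


def solve_renewal(g, h):
    """Forward recursion for P(n) = h(n) + sum_{k=0}^{n} P(n-k) g(k), using g(0) = 0."""
    P = []
    for n in range(len(h)):
        P.append(h[n] + sum(P[n - k] * g[k] for k in range(1, n + 1)))
    return P


def survival_probabilities(Q, i, j, n_max):
    """Direct computation of P_i{xi(n) = j, mu_0 > n} as the (i, j) entry of Q**n."""
    m = len(Q)
    row = [0.0] * m
    row[i] = 1.0
    out = [row[j]]
    for _ in range(n_max):
        row = [sum(row[s] * Q[s][t] for s in range(m)) for t in range(m)]
        out.append(row[j])
    return out


# Hypothetical three-state example (states 1, 2, 3) with absorption in state 0.
A, B = math.exp(-0.1), 0.5 * math.exp(-0.2)
Q_EXAMPLE = [[0.0, A, 0.0], [0.0, 0.0, A], [B, B, 0.0]]
```

Because $g_{ii}(0) = 0$, the term $k = 0$ in (4) vanishes, which is what makes the forward recursion in `solve_renewal` explicit.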

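Let us now, for each $n = 0, 1, \ldots$, multiply both sides of (4) by $e^{\rho^{(\varepsilon)} n}$, where $\rho^{(\varepsilon)}$
is the root of the characteristic equation $\varphi_{ii}^{(\varepsilon)}(\rho) = 1$. Then, we get

$$\tilde{P}_{ij}^{(\varepsilon)}(n) = \tilde{h}_{ij}^{(\varepsilon)}(n) + \sum_{k=0}^{n} \tilde{P}_{ij}^{(\varepsilon)}(n - k)\, \tilde{g}_{ii}^{(\varepsilon)}(k), \quad n = 0, 1, \ldots, \tag{5}$$

where

$$\tilde{P}_{ij}^{(\varepsilon)}(n) = e^{\rho^{(\varepsilon)} n} P_{ij}^{(\varepsilon)}(n), \quad \tilde{h}_{ij}^{(\varepsilon)}(n) = e^{\rho^{(\varepsilon)} n} h_{ij}^{(\varepsilon)}(n), \quad \tilde{g}_{ii}^{(\varepsilon)}(n) = e^{\rho^{(\varepsilon)} n} g_{ii}^{(\varepsilon)}(n).$$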


By the definition of the root of the characteristic equation, relation (5) defines a 
proper renewal equation. 

In order to prove our next result, we first formulate an auxiliary lemma. A proof 
can be found in [22]. 


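Lemma 9.2 Assume that conditions A-C hold. Then there exists $\delta > \rho^{(0)}$ such that:

(i) $\varphi_{kj}^{(\varepsilon)}(\rho) \to \varphi_{kj}^{(0)}(\rho) < \infty$, as $\varepsilon \to 0$, $\rho \le \delta$, $k, j \ne 0$.

(ii) $\omega_{kjs}^{(\varepsilon)}(\rho) \to \omega_{kjs}^{(0)}(\rho) < \infty$, as $\varepsilon \to 0$, $\rho \le \delta$, $k, j, s \ne 0$.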

We can now use the classical discrete time renewal theorem in order to get a
formula for the quasi-stationary distribution.


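Lemma 9.3 Assume that conditions A-D hold. Then:

(i) For sufficiently small $\varepsilon$, the quasi-stationary distribution $\pi_j^{(\varepsilon)}$, given by relation
(2), has the following representation,

$$\pi_j^{(\varepsilon)} = \frac{\omega_{iij}^{(\varepsilon)}(\rho^{(\varepsilon)})}{\omega_{ii1}^{(\varepsilon)}(\rho^{(\varepsilon)}) + \cdots + \omega_{iiN}^{(\varepsilon)}(\rho^{(\varepsilon)})}, \quad i, j \ne 0. \tag{6}$$

(ii) For $j = 1, \ldots, N$, we have

$$\pi_j^{(\varepsilon)} \to \pi_j^{(0)}, \quad \text{as } \varepsilon \to 0.$$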

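Proof Under condition D, the functions $g_{ii}^{(0)}(n)$ are non-periodic for all $i \ne 0$. By
Lemma 9.2 we have that $\varphi_{ii}^{(\varepsilon)}(\rho) \to \varphi_{ii}^{(0)}(\rho)$ as $\varepsilon \to 0$, for $\rho \le \delta$, $i \ne 0$. From this
it follows that $g_{ii}^{(\varepsilon)}(n) \to g_{ii}^{(0)}(n)$ as $\varepsilon \to 0$, for $n \ge 0$, $i \ne 0$. Thus, we can conclude
that there exists $\varepsilon_1 > 0$ such that the functions $g_{ii}^{(\varepsilon)}(n)$, $i \ne 0$, are non-periodic for
all $\varepsilon \le \varepsilon_1$.

Now choose $\gamma$ such that $\rho^{(0)} < \gamma < \delta$. Using Lemmas 9.1 and 9.2, we get the
following for all $i \ne 0$,

$$\limsup_{0 \le \varepsilon \to 0} \sum_{n=0}^{\infty} n\, \tilde{g}_{ii}^{(\varepsilon)}(n) \le \limsup_{0 \le \varepsilon \to 0} \sum_{n=0}^{\infty} n e^{\gamma n} g_{ii}^{(\varepsilon)}(n) \le \Big( \sup_{n \ge 0} n e^{-(\delta - \gamma) n} \Big) \varphi_{ii}^{(0)}(\delta) < \infty.$$

Thus, there exists $\varepsilon_2 > 0$ such that the distributions $\tilde{g}_{ii}^{(\varepsilon)}(n)$, $i \ne 0$, have finite mean
for all $\varepsilon \le \varepsilon_2$.

Furthermore, it follows from Lemmas 9.1 and 9.2 that, for all $i, j \ne 0$,

$$\limsup_{0 \le \varepsilon \to 0} \sum_{n=0}^{\infty} \tilde{h}_{ij}^{(\varepsilon)}(n) \le \limsup_{0 \le \varepsilon \to 0} \sum_{n=0}^{\infty} e^{\delta n} h_{ij}^{(\varepsilon)}(n) = \omega_{iij}^{(0)}(\delta) < \infty,$$

so there exists $\varepsilon_3 > 0$ such that $\sum_{n=0}^{\infty} \tilde{h}_{ij}^{(\varepsilon)}(n) < \infty$, $i, j \ne 0$, for all $\varepsilon \le \varepsilon_3$.

Now, let $\varepsilon_0 = \min\{\varepsilon_1, \varepsilon_2, \varepsilon_3\}$. For all $\varepsilon \le \varepsilon_0$, the assumptions of the discrete time
renewal theorem are satisfied for the renewal equation defined by (5). This yields

$$\tilde{P}_{ij}^{(\varepsilon)}(n) \to \frac{\sum_{k=0}^{\infty} \tilde{h}_{ij}^{(\varepsilon)}(k)}{\sum_{k=0}^{\infty} k\, \tilde{g}_{ii}^{(\varepsilon)}(k)}, \quad \text{as } n \to \infty,\ i, j \ne 0,\ \varepsilon \le \varepsilon_0. \tag{7}$$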
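
Note that we have

$$\mathsf{P}_i\{\xi^{(\varepsilon)}(n) = j \mid \mu_0^{(\varepsilon)} > n\} = \frac{\tilde{P}_{ij}^{(\varepsilon)}(n)}{\sum_{k=1}^{N} \tilde{P}_{ik}^{(\varepsilon)}(n)}, \quad n = 0, 1, \ldots,\ i, j \ne 0. \tag{8}$$

It follows from (7) and (8) that, for $\varepsilon \le \varepsilon_0$,

$$\mathsf{P}_i\{\xi^{(\varepsilon)}(n) = j \mid \mu_0^{(\varepsilon)} > n\} \to \frac{\omega_{iij}^{(\varepsilon)}(\rho^{(\varepsilon)})}{\sum_{k=1}^{N} \omega_{iik}^{(\varepsilon)}(\rho^{(\varepsilon)})}, \quad \text{as } n \to \infty,\ i, j \ne 0.$$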

This proves part (i). 
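For the proof of part (ii), first note that,

$$0 \le \limsup_{0 \le \varepsilon \to 0} \sum_{n=N}^{\infty} e^{\rho^{(\varepsilon)} n} h_{ij}^{(\varepsilon)}(n) \le \limsup_{0 \le \varepsilon \to 0} \sum_{n=N}^{\infty} e^{\gamma n} h_{ij}^{(\varepsilon)}(n) \le e^{-(\delta - \gamma) N} \omega_{iij}^{(0)}(\delta) \to 0, \quad \text{as } N \to \infty,\ i, j \ne 0. \tag{9}$$

Relation (9) implies that

$$\lim_{N \to \infty} \limsup_{0 \le \varepsilon \to 0} \sum_{n=N}^{\infty} e^{\rho^{(\varepsilon)} n} h_{ij}^{(\varepsilon)}(n) = 0, \quad i, j \ne 0. \tag{10}$$

It follows from Lemma 9.1 that

$$\rho^{(\varepsilon)} \to \rho^{(0)}, \quad \text{as } \varepsilon \to 0. \tag{11}$$

Since $h_{ij}^{(\varepsilon)}(n)$, for each $n = 0, 1, \ldots$, can be written as a finite sum where each
term in the sum is a continuous function of the quantities given in condition A, we
have

$$h_{ij}^{(\varepsilon)}(n) \to h_{ij}^{(0)}(n), \quad \text{as } \varepsilon \to 0,\ i, j \ne 0. \tag{12}$$

It now follows from (10)-(12) that

$$\omega_{iij}^{(\varepsilon)}(\rho^{(\varepsilon)}) \to \omega_{iij}^{(0)}(\rho^{(0)}), \quad \text{as } \varepsilon \to 0,\ i, j \ne 0. \tag{13}$$

Relations (6) and (13) show that part (ii) of Lemma 9.3 holds.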


4 Proof of the Main Result


In this section we prove Theorem 9.1.

Throughout this section, it is assumed that conditions A-D and $\mathbf{P}_{k+1}$ hold.

The proof is given in a sequence of lemmas. For the proof of the first lemma, we 
refer to [22]. 


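Lemma 9.4 For $r = 0, \ldots, k$ and $i, j \ne 0$ we have the following asymptotic
expansions,

$$\omega_{iij}^{(\varepsilon)}(\rho^{(0)}, r) = a_{ij}[r, 0] + a_{ij}[r, 1]\varepsilon + \cdots + a_{ij}[r, k - r]\varepsilon^{k-r} + o(\varepsilon^{k-r}), \tag{14}$$

$$\varphi_{ii}^{(\varepsilon)}(\rho^{(0)}, r) = b_i[r, 0] + b_i[r, 1]\varepsilon + \cdots + b_i[r, k - r]\varepsilon^{k-r} + o(\varepsilon^{k-r}), \tag{15}$$

where the coefficients in these expansions can be calculated from lemmas and
theorems given in [22].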


Let us now recall from Sect. 3 that the quasi-stationary distribution, for sufficiently
small $\varepsilon$, has the following representation,

$$\pi_j^{(\varepsilon)} = \frac{\omega_{iij}^{(\varepsilon)}(\rho^{(\varepsilon)})}{\omega_{ii1}^{(\varepsilon)}(\rho^{(\varepsilon)}) + \cdots + \omega_{iiN}^{(\varepsilon)}(\rho^{(\varepsilon)})}, \quad j = 1, \ldots, N. \tag{16}$$

The construction of the asymptotic expansion for the quasi-stationary distribution
will be realized in three steps. First, we use the coefficients in the expansions given by
(15) to build an asymptotic expansion for $\rho^{(\varepsilon)}$, the root of the characteristic equation.
Then, the coefficients in this expansion and the coefficients in the expansions given
by (14) are used to construct asymptotic expansions for $\omega_{iij}^{(\varepsilon)}(\rho^{(\varepsilon)})$. Finally, relation
(16) is used to complete the proof.

We formulate these steps in the following three lemmas. Let us here remark that 
the proof of Lemma 9.5 is given in [25] in the context of general discrete time renewal 
equations and the proofs of Lemmas 9.6 and 9.7 are given in [20] in the context of 
quasi-stationary distributions for discrete time regenerative processes. In order to 
make the paper more self-contained, we also give the proofs here, in slightly reduced 
forms. 


Lemma 9.5 The root of the characteristic equation has the following asymptotic
expansion,

$$\rho^{(\varepsilon)} = \rho^{(0)} + c_1 \varepsilon + \cdots + c_k \varepsilon^k + o(\varepsilon^k),$$

where $c_1 = -b_i[0, 1]/b_i[1, 0]$ and, for $n = 2, \ldots, k$,

$$c_n = -\frac{1}{b_i[1, 0]} \Bigg( b_i[0, n] + \sum_{q=1}^{n-1} b_i[1, n - q]\, c_q + \sum_{m=2}^{n} \sum_{q=m}^{n} b_i[m, n - q] \cdot \sum_{n_1, \ldots, n_{q-1} \in D_{m,q}} \prod_{p=1}^{q-1} \frac{c_p^{n_p}}{n_p!} \Bigg),$$

where $D_{m,q}$ is the set of all non-negative integer solutions of the system

$$n_1 + \cdots + n_{q-1} = m, \quad n_1 + 2 n_2 + \cdots + (q - 1) n_{q-1} = q.$$

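Proof Let $\Delta^{(\varepsilon)} = \rho^{(\varepsilon)} - \rho^{(0)}$. It follows from the Taylor expansion of the exponential
function that, for $n = 0, 1, \ldots$,

$$e^{\rho^{(\varepsilon)} n} = e^{\rho^{(0)} n} \Bigg( \sum_{r=0}^{k} \frac{(\Delta^{(\varepsilon)})^r n^r}{r!} + \frac{(\Delta^{(\varepsilon)})^{k+1} n^{k+1}}{(k+1)!} e^{|\Delta^{(\varepsilon)}| n} \zeta_{k+1}^{(\varepsilon)}(n) \Bigg), \tag{17}$$

where $0 \le \zeta_{k+1}^{(\varepsilon)}(n) \le 1$.

If we multiply both sides of (17) by $g_{ii}^{(\varepsilon)}(n)$, sum over all $n$, and use that $\rho^{(\varepsilon)}$ is
the root of the characteristic equation, we get

$$1 = \sum_{r=0}^{k} \frac{(\Delta^{(\varepsilon)})^r}{r!} \varphi_{ii}^{(\varepsilon)}(\rho^{(0)}, r) + (\Delta^{(\varepsilon)})^{k+1} M_{k+1}^{(\varepsilon)}, \tag{18}$$

where

$$M_{k+1}^{(\varepsilon)} = \frac{1}{(k+1)!} \sum_{n=0}^{\infty} n^{k+1} e^{(\rho^{(0)} + |\Delta^{(\varepsilon)}|) n} \zeta_{k+1}^{(\varepsilon)}(n)\, g_{ii}^{(\varepsilon)}(n). \tag{19}$$

Let $\delta > \rho^{(0)}$ be the value from Lemma 9.2. It follows from Lemma 9.1 that
$|\Delta^{(\varepsilon)}| \to 0$ as $\varepsilon \to 0$, so there exist $\beta > 0$ and $\varepsilon_1(\beta) > 0$ such that

$$\rho^{(0)} + |\Delta^{(\varepsilon)}| \le \beta < \delta, \quad \varepsilon \le \varepsilon_1(\beta). \tag{20}$$

Since $\beta < \delta$, Lemma 9.2 implies that there exists $\varepsilon_2(\beta) > 0$ such that

$$\varphi_{ii}^{(\varepsilon)}(\beta, r) < \infty, \quad r = 0, 1, \ldots,\ \varepsilon \le \varepsilon_2(\beta). \tag{21}$$

Let $\varepsilon_0 = \varepsilon_0(\beta) = \min\{\varepsilon_1(\beta), \varepsilon_2(\beta)\}$. Then, relations (19)-(21) imply that

$$M_{k+1}^{(\varepsilon)} \le \frac{1}{(k+1)!} \varphi_{ii}^{(\varepsilon)}(\beta, k + 1) < \infty, \quad \varepsilon \le \varepsilon_0. \tag{22}$$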
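
It follows from (22) that we can rewrite (18) as

$$1 = \sum_{r=0}^{k} \frac{(\Delta^{(\varepsilon)})^r}{r!} \varphi_{ii}^{(\varepsilon)}(\rho^{(0)}, r) + (\Delta^{(\varepsilon)})^{k+1} M_{k+1} \zeta_{k+1}^{(\varepsilon)}, \tag{23}$$

where $M_{k+1} = \sup_{\varepsilon \le \varepsilon_0} M_{k+1}^{(\varepsilon)} < \infty$ and $0 \le \zeta_{k+1}^{(\varepsilon)} \le 1$.

From relation (23) we can successively construct the asymptotic expansion for
the root of the characteristic equation.

Let us first assume that $k = 1$. In this case (23) implies that

$$1 = \varphi_{ii}^{(\varepsilon)}(\rho^{(0)}, 0) + \Delta^{(\varepsilon)} \varphi_{ii}^{(\varepsilon)}(\rho^{(0)}, 1) + (\Delta^{(\varepsilon)})^2 O(1). \tag{24}$$

Using (15), (24), and that $\Delta^{(\varepsilon)} \to 0$ as $\varepsilon \to 0$, it follows that

$$-b_i[0, 1]\varepsilon = \Delta^{(\varepsilon)} (b_i[1, 0] + o(1)) + o(\varepsilon). \tag{25}$$

Dividing both sides of Eq. (25) by $\varepsilon$ and letting $\varepsilon$ tend to zero, we can conclude
that $\Delta^{(\varepsilon)}/\varepsilon \to -b_i[0, 1]/b_i[1, 0]$ as $\varepsilon \to 0$. From this it follows that we have the
representation

$$\Delta^{(\varepsilon)} = c_1 \varepsilon + \Delta_1^{(\varepsilon)}, \tag{26}$$

where $c_1 = -b_i[0, 1]/b_i[1, 0]$ and $\Delta_1^{(\varepsilon)}/\varepsilon \to 0$ as $\varepsilon \to 0$.

This proves Lemma 9.5 for the case $k = 1$.

Let us now assume that $k = 2$. In this case relation (23) implies that

$$1 = \varphi_{ii}^{(\varepsilon)}(\rho^{(0)}, 0) + \Delta^{(\varepsilon)} \varphi_{ii}^{(\varepsilon)}(\rho^{(0)}, 1) + \frac{(\Delta^{(\varepsilon)})^2}{2} \varphi_{ii}^{(\varepsilon)}(\rho^{(0)}, 2) + (\Delta^{(\varepsilon)})^3 O(1). \tag{27}$$

Using (15) and (26) in relation (27) and rearranging gives

$$-\Big( b_i[0, 2] + b_i[1, 1] c_1 + \frac{b_i[2, 0] c_1^2}{2} \Big) \varepsilon^2 = \Delta_1^{(\varepsilon)} (b_i[1, 0] + o(1)) + o(\varepsilon^2). \tag{28}$$

Dividing both sides of Eq. (28) by $\varepsilon^2$ and letting $\varepsilon$ tend to zero, we can conclude
that $\Delta_1^{(\varepsilon)}/\varepsilon^2 \to c_2$ as $\varepsilon \to 0$, where

$$c_2 = -\frac{1}{b_i[1, 0]} \Big( b_i[0, 2] + b_i[1, 1] c_1 + \frac{b_i[2, 0] c_1^2}{2} \Big).$$

From this and (26) it follows that we have the representation

$$\Delta^{(\varepsilon)} = c_1 \varepsilon + c_2 \varepsilon^2 + \Delta_2^{(\varepsilon)},$$

where $\Delta_2^{(\varepsilon)}/\varepsilon^2 \to 0$ as $\varepsilon \to 0$.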




This proves Lemma 9.5 for the case $k = 2$.

Continuing in this way we can prove the lemma for any positive integer $k$. However,
once it is known that the expansion exists, the coefficients can be obtained in a
simpler way. From (15) and (23) we get the following formal equation,

$$-(b_i[0, 1]\varepsilon + b_i[0, 2]\varepsilon^2 + \cdots) = (c_1 \varepsilon + c_2 \varepsilon^2 + \cdots)(b_i[1, 0] + b_i[1, 1]\varepsilon + \cdots) + \frac{1}{2!}(c_1 \varepsilon + c_2 \varepsilon^2 + \cdots)^2 (b_i[2, 0] + b_i[2, 1]\varepsilon + \cdots) + \cdots. \tag{29}$$

By expanding the right hand side of (29) and then equating coefficients of equal
powers of $\varepsilon$ in the left and right hand sides, we obtain the formulas given in Lemma
9.5.
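
The coefficient matching in (29) can be sketched with exact rational arithmetic. The code below is an illustration under stated assumptions, not the algorithm of [25]: the function names and the data layout `b[r][n]` (standing for $b_i[r, n]$, $r = 0, \ldots, k$, $n = 0, \ldots, k - r$) are chosen for this example, and instead of enumerating the sets $D_{m,q}$ it multiplies truncated power series, which yields the same coefficients.

```python
from fractions import Fraction
from math import factorial


def poly_mul(p, q, k):
    """Product of two polynomials in eps (coefficient lists), truncated after eps**k."""
    out = [Fraction(0)] * (k + 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            if i + j <= k:
                out[i + j] += Fraction(p_i) * Fraction(q_j)
    return out


def root_coefficients(b, k):
    """Return [c_1, ..., c_k] by equating coefficients of eps**n in the formal equation."""
    c = [Fraction(0)] * (k + 1)               # c[0] = 0 is only a placeholder
    for n in range(1, k + 1):
        # Coefficient of eps**n in sum_r Delta**r / r! * B_r(eps), with c_n still zero.
        residual = Fraction(0)
        power = [Fraction(1)] + [Fraction(0)] * k     # Delta**0
        for r in range(0, n + 1):
            term = poly_mul(power, b[r], k)
            residual += term[n] / factorial(r)
            power = poly_mul(power, c, k)             # Delta**(r + 1)
        # The only term containing c_n is c_n * b[1][0]; it must cancel the residual.
        c[n] = -residual / Fraction(b[1][0])
    return c[1:]
```

For $k = 2$ the recursion reduces to $c_1 = -b_i[0,1]/b_i[1,0]$ and the expression for $c_2$ obtained from (28).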


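Lemma 9.6 For any $i, j \ne 0$, we have the following asymptotic expansion,

$$\omega_{iij}^{(\varepsilon)}(\rho^{(\varepsilon)}) = \omega_{iij}^{(0)}(\rho^{(0)}) + d_{ij}[1]\varepsilon + \cdots + d_{ij}[k]\varepsilon^k + o(\varepsilon^k),$$

where $d_{ij}[1] = a_{ij}[0, 1] + a_{ij}[1, 0] c_1$, and, for $n = 2, \ldots, k$,

$$d_{ij}[n] = a_{ij}[0, n] + \sum_{q=1}^{n} a_{ij}[1, n - q]\, c_q + \sum_{m=2}^{n} \sum_{q=m}^{n} a_{ij}[m, n - q] \cdot \sum_{n_1, \ldots, n_{q-1} \in D_{m,q}} \prod_{p=1}^{q-1} \frac{c_p^{n_p}}{n_p!},$$

where $D_{m,q}$ is the set of all non-negative integer solutions of the system

$$n_1 + \cdots + n_{q-1} = m, \quad n_1 + 2 n_2 + \cdots + (q - 1) n_{q-1} = q.$$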


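Proof Let us again use relation (17) given in the proof of Lemma 9.5. Multiplying
both sides of (17) by $h_{ij}^{(\varepsilon)}(n)$ and summing over all $n$ we get

$$\omega_{iij}^{(\varepsilon)}(\rho^{(\varepsilon)}) = \sum_{r=0}^{k} \frac{(\Delta^{(\varepsilon)})^r}{r!} \omega_{iij}^{(\varepsilon)}(\rho^{(0)}, r) + (\Delta^{(\varepsilon)})^{k+1} \tilde{M}_{k+1}^{(\varepsilon)}, \tag{30}$$

where

$$\tilde{M}_{k+1}^{(\varepsilon)} = \frac{1}{(k+1)!} \sum_{n=0}^{\infty} n^{k+1} e^{(\rho^{(0)} + |\Delta^{(\varepsilon)}|) n} \zeta_{k+1}^{(\varepsilon)}(n)\, h_{ij}^{(\varepsilon)}(n).$$

Using similar arguments as in the proof of Lemma 9.5 we can rewrite (30) as

$$\omega_{iij}^{(\varepsilon)}(\rho^{(\varepsilon)}) = \sum_{r=0}^{k} \frac{(\Delta^{(\varepsilon)})^r}{r!} \omega_{iij}^{(\varepsilon)}(\rho^{(0)}, r) + (\Delta^{(\varepsilon)})^{k+1} \tilde{M}_{k+1} \tilde{\zeta}_{k+1}^{(\varepsilon)}, \tag{31}$$

where $\tilde{M}_{k+1} = \sup_{\varepsilon \le \varepsilon_0} \tilde{M}_{k+1}^{(\varepsilon)} < \infty$, for some $\varepsilon_0 > 0$, and $0 \le \tilde{\zeta}_{k+1}^{(\varepsilon)} \le 1$.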
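
From Lemma 9.5 we have the following asymptotic expansion,

$$\Delta^{(\varepsilon)} = c_1 \varepsilon + \cdots + c_k \varepsilon^k + o(\varepsilon^k). \tag{32}$$

Substituting the expansions (14) and (32) into relation (31) yields

$$\begin{aligned} \omega_{iij}^{(\varepsilon)}(\rho^{(\varepsilon)}) ={} & \omega_{iij}^{(0)}(\rho^{(0)}) + a_{ij}[0, 1]\varepsilon + \cdots + a_{ij}[0, k]\varepsilon^k + o(\varepsilon^k) \\ & + (c_1 \varepsilon + \cdots + c_k \varepsilon^k + o(\varepsilon^k)) \\ & \quad \times (a_{ij}[1, 0] + a_{ij}[1, 1]\varepsilon + \cdots + a_{ij}[1, k - 1]\varepsilon^{k-1} + o(\varepsilon^{k-1})) \\ & + \cdots + \frac{1}{k!} (c_1 \varepsilon + \cdots + c_k \varepsilon^k + o(\varepsilon^k))^k (a_{ij}[k, 0] + o(1)). \end{aligned} \tag{33}$$

By expanding the right hand side of (33) and grouping coefficients of equal powers
of $\varepsilon$ we get the expansions and formulas given in Lemma 9.6.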
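
Expanding (33) amounts to composing truncated power series. The following sketch shows one way to do this with exact fractions; the function names and the layout `a[r][n]` (standing for $a_{ij}[r, n]$ with $i, j$ fixed) are assumptions of this illustration rather than notation from the cited works.

```python
from fractions import Fraction
from math import factorial


def poly_mul(p, q, k):
    """Product of two polynomials in eps (coefficient lists), truncated after eps**k."""
    out = [Fraction(0)] * (k + 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            if i + j <= k:
                out[i + j] += Fraction(p_i) * Fraction(q_j)
    return out


def omega_coefficients(a, c, k):
    """Return [d[0], ..., d[k]] from a[r][n] and the root coefficients c = [c_1, ..., c_k]."""
    delta = [Fraction(0)] + [Fraction(x) for x in c]   # Delta = c_1 eps + ... + c_k eps**k
    d = [Fraction(0)] * (k + 1)
    power = [Fraction(1)] + [Fraction(0)] * k          # Delta**0
    for r in range(k + 1):
        term = poly_mul(power, a[r], k)                # Delta**r * A_r(eps)
        for n in range(k + 1):
            d[n] += term[n] / factorial(r)
        power = poly_mul(power, delta, k)
    return d
```

With `a = [[2, -8, 34], [6, -48], [34]]` and `c = [Fraction(7, 5), Fraction(-1, 125)]` the function returns the coefficients $2$, $2/5$, $9/125$, which agree with the values of $d_{12}[n]$ reported in Sect. 5.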


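Lemma 9.7 For any $j \ne 0$, we have the following asymptotic expansion,

$$\pi_j^{(\varepsilon)} = \pi_j^{(0)} + \pi_j[1]\varepsilon + \cdots + \pi_j[k]\varepsilon^k + o(\varepsilon^k). \tag{34}$$

The coefficients $\pi_j[n]$, $n = 1, \ldots, k$, $j \ne 0$, are for any $i \ne 0$ given by the following
recursive formulas,

$$\pi_j[n] = \frac{1}{e_i[0]} \Big( d_{ij}[n] - \sum_{q=0}^{n-1} e_i[n - q]\, \pi_j[q] \Big), \quad n = 1, \ldots, k,$$

where $\pi_j[0] = \pi_j^{(0)}$, $d_{ij}[0] = \omega_{iij}^{(0)}(\rho^{(0)})$, and $e_i[n] = \sum_{j \ne 0} d_{ij}[n]$, $n = 0, \ldots, k$.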


Proof It follows from formula (16) and Lemma 9.6 that for all $i, j \ne 0$ we have

$$\pi_j^{(\varepsilon)} = \frac{d_{ij}[0] + d_{ij}[1]\varepsilon + \cdots + d_{ij}[k]\varepsilon^k + o(\varepsilon^k)}{e_i[0] + e_i[1]\varepsilon + \cdots + e_i[k]\varepsilon^k + o(\varepsilon^k)}. \tag{35}$$

Since $e_i[0] > 0$, it follows from (35) that the expansion (34) exists. From this and
(35) we get the following equation,

$$(e_i[0] + e_i[1]\varepsilon + \cdots + e_i[k]\varepsilon^k + o(\varepsilon^k)) \times (\pi_j[0] + \pi_j[1]\varepsilon + \cdots + \pi_j[k]\varepsilon^k + o(\varepsilon^k)) = d_{ij}[0] + d_{ij}[1]\varepsilon + \cdots + d_{ij}[k]\varepsilon^k + o(\varepsilon^k). \tag{36}$$

By expanding the left hand side of (36) and then equating coefficients of equal
powers of $\varepsilon$ in the left and right hand sides, we obtain the coefficients given in
Lemma 9.7.
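
Equating coefficients in (36) is a division of truncated power series, which can be sketched as follows. The function name and the layout `d[j][n]` (standing for $d_{ij}[n]$ with the reference state $i$ fixed and $j$ running over the non-absorbing states) are assumptions made for this illustration.

```python
from fractions import Fraction


def quasi_stationary_coefficients(d, k):
    """Return pi[j][n], n = 0..k, from d[j][n] by matching coefficients in (36)."""
    # e[n] is the sum of d[j][n] over all non-absorbing states j.
    e = [sum(Fraction(d_j[n]) for d_j in d) for n in range(k + 1)]
    pi = []
    for d_j in d:
        coeffs = [Fraction(d_j[0]) / e[0]]
        for n in range(1, k + 1):
            # Terms of order eps**n on the left of (36) that involve earlier coefficients.
            known = sum(e[n - q] * coeffs[q] for q in range(n))
            coeffs.append((Fraction(d_j[n]) - known) / e[0])
        pi.append(coeffs)
    return pi
```

Since the expansions of $\pi_j^{(\varepsilon)}$ over $j \ne 0$ sum to one, the returned coefficients of order $n \ge 1$ sum to zero over $j$, which gives a simple sanity check on the input data.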


5 Perturbed Markov Chains


In this section it is shown how the results of the present paper can be applied in the 
special case of perturbed discrete time Markov chains. As an illustration, we present 
a simple numerical example. 


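For every $\varepsilon \ge 0$, let $\eta_n^{(\varepsilon)}$, $n = 0, 1, \ldots$, be a homogeneous discrete time Markov
chain with state space $X = \{0, 1, \ldots, N\}$, an initial distribution $p_i^{(\varepsilon)} = \mathsf{P}\{\eta_0^{(\varepsilon)} = i\}$,
$i \in X$, and transition probabilities

$$p_{ij}^{(\varepsilon)} = \mathsf{P}\{\eta_{n+1}^{(\varepsilon)} = j \mid \eta_n^{(\varepsilon)} = i\}, \quad i, j \in X.$$

This model is a particular case of a semi-Markov process. In this case, the transition
probabilities are given by

$$Q_{ij}^{(\varepsilon)}(n) = p_{ij}^{(\varepsilon)} \chi(n = 1), \quad n = 1, 2, \ldots,\ i, j \in X.$$

Furthermore, mixed power-exponential moment functionals for transition
probabilities take the following form,

$$p_{ij}^{(\varepsilon)}(\rho, r) = \sum_{n=0}^{\infty} n^r e^{\rho n} Q_{ij}^{(\varepsilon)}(n) = e^{\rho} p_{ij}^{(\varepsilon)}, \quad \rho \in \mathbb{R},\ r = 0, 1, \ldots,\ i, j \in X. \tag{37}$$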
n=0 


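Conditions A-D and $\mathbf{P}_k$ imposed in Sect. 2 now hold if the following conditions
are satisfied:

A′: $g_{ij}^{(0)} > 0$, $i, j \ne 0$.

B′: $g_{ii}^{(0)}(n)$ is non-periodic for some $i \ne 0$.

$\mathbf{P}'_k$: $p_{ij}^{(\varepsilon)} = p_{ij}^{(0)} + p_{ij}[1]\varepsilon + \cdots + p_{ij}[k]\varepsilon^k + o(\varepsilon^k)$, $i, j \ne 0$, where $|p_{ij}[n]| < \infty$,
$n = 1, \ldots, k$, $i, j \ne 0$.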


Let us here remark that in order to construct an asymptotic expansion of order $k$
for the quasi-stationary distribution of a Markov chain, it is sufficient to assume that
the perturbation condition holds for the parameter $k$, and not for $k + 1$ as needed for
semi-Markov processes. The stronger perturbation condition with parameter $k + 1$ is
needed in order to construct the asymptotic expansions given in Eq. (14). However,
for Markov chains these expansions can be constructed under the weaker perturbation
condition. This follows from results given in [22].

It follows from (37) and $\mathbf{P}'_k$ that the coefficients in the perturbation condition $\mathbf{P}_k$
are given by

$$p_{ij}[\rho^{(0)}, r, n] = e^{\rho^{(0)}} p_{ij}[n], \quad r = 0, \ldots, k,\ n = 0, \ldots, k - r,\ i, j \ne 0. \tag{38}$$


Let us illustrate the remarks made above by means of a simple numerical example
where we compute the asymptotic expansion of second order for the quasi-stationary
distribution of a Markov chain with four states. We consider the simplest case where
transitions to state 0 are not possible for the limiting Markov chain. In this case, exact
computations can be made and we can focus on the algorithm itself and need not
consider possible numerical issues.

We consider a perturbed Markov chain $\eta_n^{(\varepsilon)}$, $n = 0, 1, \ldots$, on the state space
$X = \{0, 1, 2, 3\}$ with a matrix of transition probabilities given by

$$\left\| p_{ij}^{(\varepsilon)} \right\| = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 1 - e^{-\varepsilon} & 0 & e^{-\varepsilon} & 0 \\ 1 - e^{-\varepsilon} & 0 & 0 & e^{-\varepsilon} \\ 1 - e^{-2\varepsilon} & \tfrac{1}{2} e^{-2\varepsilon} & \tfrac{1}{2} e^{-2\varepsilon} & 0 \end{bmatrix}, \quad \varepsilon \ge 0. \tag{39}$$


First, using the well known asymptotic expansion for the exponential function,
we obtain the coefficients in condition $\mathbf{P}'_k$. The non-zero coefficients in this condition
take the following numerical values,

$$\begin{aligned} p_{12}[0] &= 1, & p_{23}[0] &= 1, & p_{31}[0] &= 1/2, & p_{32}[0] &= 1/2, \\ p_{12}[1] &= -1, & p_{23}[1] &= -1, & p_{31}[1] &= -1, & p_{32}[1] &= -1, \\ p_{12}[2] &= 1/2, & p_{23}[2] &= 1/2, & p_{31}[2] &= 1, & p_{32}[2] &= 1. \end{aligned} \tag{40}$$
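
The values in (40) are the Taylor coefficients of $e^{-\varepsilon}$ and $\tfrac{1}{2} e^{-2\varepsilon}$ at $\varepsilon = 0$. A minimal sketch of this step, with a helper name chosen for the illustration, is:

```python
from fractions import Fraction
from math import factorial


def exp_taylor(scale, rate, order):
    """Coefficients of eps**n, n = 0..order, in scale * exp(-rate * eps)."""
    return [Fraction(scale) * Fraction(-rate) ** n / factorial(n)
            for n in range(order + 1)]


# Entries e^{-eps} (used for p_12 and p_23) and (1/2) e^{-2 eps} (used for p_31 and p_32).
P12_COEFFS = exp_taylor(1, 1, 2)
P31_COEFFS = exp_taylor(Fraction(1, 2), 2, 2)
```

Here `P12_COEFFS` holds $p_{12}[0], p_{12}[1], p_{12}[2]$ and `P31_COEFFS` holds $p_{31}[0], p_{31}[1], p_{31}[2]$.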


Then, the root of the characteristic equation for the limiting Markov chain needs to
be found. Since $\varphi_{ii}^{(0)}(0) = \mathsf{P}_i\{\nu_0^{(0)} > \nu_i^{(0)}\} = 1$, we have $\rho^{(0)} = 0$. In the case where
transitions to state 0 are possible also for the limiting Markov chain, the root $\rho^{(0)}$ needs
to be computed numerically.

Now, using that $\rho^{(0)} = 0$ and relations (38) and (40), we obtain the coefficients
in condition $\mathbf{P}_k$.

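The next step is to determine the coefficients in the expansions given in Eqs. (14)
and (15) for the case where $k = 2$ and $i$ is some fixed state which we can choose
arbitrarily. Let us choose $i = 1$. In order to compute these coefficients we apply the
results given in [22]. According to these results, we can, based on the coefficients in
condition $\mathbf{P}_k$, compute the following asymptotic vector expansions,

$$\begin{aligned} \Phi_1^{(\varepsilon)}(0, 0) &= \Phi_1[0, 0, 0] + \Phi_1[0, 0, 1]\varepsilon + \Phi_1[0, 0, 2]\varepsilon^2 + o(\varepsilon^2), \\ \Phi_1^{(\varepsilon)}(0, 1) &= \Phi_1[0, 1, 0] + \Phi_1[0, 1, 1]\varepsilon + o(\varepsilon), \\ \Phi_1^{(\varepsilon)}(0, 2) &= \Phi_1[0, 2, 0] + o(1), \end{aligned} \tag{41}$$

and, for $j = 1, 2, 3$,

$$\begin{aligned} \Omega_{1j}^{(\varepsilon)}(0, 0) &= \Omega_{1j}[0, 0, 0] + \Omega_{1j}[0, 0, 1]\varepsilon + \Omega_{1j}[0, 0, 2]\varepsilon^2 + o(\varepsilon^2), \\ \Omega_{1j}^{(\varepsilon)}(0, 1) &= \Omega_{1j}[0, 1, 0] + \Omega_{1j}[0, 1, 1]\varepsilon + o(\varepsilon), \\ \Omega_{1j}^{(\varepsilon)}(0, 2) &= \Omega_{1j}[0, 2, 0] + o(1), \end{aligned} \tag{42}$$

where

$$\Phi_1^{(\varepsilon)}(0, r) = \begin{bmatrix} \varphi_{11}^{(\varepsilon)}(0, r) & \varphi_{21}^{(\varepsilon)}(0, r) & \varphi_{31}^{(\varepsilon)}(0, r) \end{bmatrix}^{T}, \quad r = 0, 1, 2,$$

and

$$\Omega_{1j}^{(\varepsilon)}(0, r) = \begin{bmatrix} \omega_{11j}^{(\varepsilon)}(0, r) & \omega_{21j}^{(\varepsilon)}(0, r) & \omega_{31j}^{(\varepsilon)}(0, r) \end{bmatrix}^{T}, \quad r = 0, 1, 2,\ j = 1, 2, 3.$$

For example, the coefficients in (41) take the following numerical values,

$$\begin{aligned} \Phi_1[0, 0, 0] &= \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, & \Phi_1[0, 0, 1] &= \begin{bmatrix} -7 \\ -6 \\ -5 \end{bmatrix}, & \Phi_1[0, 0, 2] &= \begin{bmatrix} 67/2 \\ 27 \\ 43/2 \end{bmatrix}, \\ \Phi_1[0, 1, 0] &= \begin{bmatrix} 5 \\ 4 \\ 3 \end{bmatrix}, & \Phi_1[0, 1, 1] &= \begin{bmatrix} -47 \\ -36 \\ -27 \end{bmatrix}, & \Phi_1[0, 2, 0] &= \begin{bmatrix} 33 \\ 24 \\ 17 \end{bmatrix}. \end{aligned} \tag{43}$$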


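In particular, from (41) and (42) we can extract the following asymptotic
expansions,

$$\begin{aligned} \varphi_{11}^{(\varepsilon)}(0, 0) &= b_1[0, 0] + b_1[0, 1]\varepsilon + b_1[0, 2]\varepsilon^2 + o(\varepsilon^2), \\ \varphi_{11}^{(\varepsilon)}(0, 1) &= b_1[1, 0] + b_1[1, 1]\varepsilon + o(\varepsilon), \\ \varphi_{11}^{(\varepsilon)}(0, 2) &= b_1[2, 0] + o(1), \end{aligned}$$

and, for $j = 1, 2, 3$,

$$\begin{aligned} \omega_{11j}^{(\varepsilon)}(0, 0) &= a_{1j}[0, 0] + a_{1j}[0, 1]\varepsilon + a_{1j}[0, 2]\varepsilon^2 + o(\varepsilon^2), \\ \omega_{11j}^{(\varepsilon)}(0, 1) &= a_{1j}[1, 0] + a_{1j}[1, 1]\varepsilon + o(\varepsilon), \\ \omega_{11j}^{(\varepsilon)}(0, 2) &= a_{1j}[2, 0] + o(1). \end{aligned}$$

From (41) and (43) it follows that

$$\begin{aligned} b_1[0, 0] &= 1, & b_1[0, 1] &= -7, & b_1[0, 2] &= 67/2, \\ b_1[1, 0] &= 5, & b_1[1, 1] &= -47, & b_1[2, 0] &= 33. \end{aligned} \tag{44}$$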

By first calculating the coefficients in (42), we then get the following numerical
values,

$$\begin{aligned} a_{11}[0, 0] &= 1, & a_{12}[0, 0] &= 2, & a_{13}[0, 0] &= 2, \\ a_{11}[0, 1] &= 0, & a_{12}[0, 1] &= -8, & a_{13}[0, 1] &= -10, \\ a_{11}[0, 2] &= 0, & a_{12}[0, 2] &= 34, & a_{13}[0, 2] &= 43, \\ a_{11}[1, 0] &= 0, & a_{12}[1, 0] &= 6, & a_{13}[1, 0] &= 8, \\ a_{11}[1, 1] &= 0, & a_{12}[1, 1] &= -48, & a_{13}[1, 1] &= -64, \\ a_{11}[2, 0] &= 0, & a_{12}[2, 0] &= 34, & a_{13}[2, 0] &= 48. \end{aligned} \tag{45}$$

The asymptotic expansion for the quasi-stationary distribution can now be
computed from the coefficients in Eqs. (44) and (45) by applying the lemmas in Sect. 4.

From Lemma 9.5 we get that the asymptotic expansion for the root of the
characteristic equation is given by

$$\rho^{(\varepsilon)} = c_1 \varepsilon + c_2 \varepsilon^2 + o(\varepsilon^2),$$

where

$$c_1 = -\frac{b_1[0, 1]}{b_1[1, 0]} = \frac{7}{5}, \quad c_2 = -\frac{b_1[0, 2] + b_1[1, 1] c_1 + b_1[2, 0] c_1^2 / 2}{b_1[1, 0]} = -\frac{1}{125}. \tag{46}$$
~ 


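Then, Lemma 9.6 gives us the following asymptotic expansions,

$$\omega_{11j}^{(\varepsilon)}(\rho^{(\varepsilon)}) = d_{1j}[0] + d_{1j}[1]\varepsilon + d_{1j}[2]\varepsilon^2 + o(\varepsilon^2), \quad j = 1, 2, 3,$$

where

$$\begin{aligned} d_{1j}[0] &= a_{1j}[0, 0], \\ d_{1j}[1] &= a_{1j}[0, 1] + a_{1j}[1, 0] c_1, \\ d_{1j}[2] &= a_{1j}[0, 2] + a_{1j}[1, 1] c_1 + a_{1j}[1, 0] c_2 + a_{1j}[2, 0] c_1^2 / 2. \end{aligned} \tag{47}$$

From (45)-(47), we calculate

$$\begin{aligned} d_{11}[0] &= 1, & d_{12}[0] &= 2, & d_{13}[0] &= 2, \\ d_{11}[1] &= 0, & d_{12}[1] &= 2/5, & d_{13}[1] &= 6/5, \\ d_{11}[2] &= 0, & d_{12}[2] &= 9/125, & d_{13}[2] &= 47/125. \end{aligned} \tag{48}$$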


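Finally, let us use Lemma 9.7. First, using (48), we get

$$\begin{aligned} e_1[0] &= d_{11}[0] + d_{12}[0] + d_{13}[0] = 5, \\ e_1[1] &= d_{11}[1] + d_{12}[1] + d_{13}[1] = 8/5, \\ e_1[2] &= d_{11}[2] + d_{12}[2] + d_{13}[2] = 56/125. \end{aligned} \tag{49}$$

Then, we can construct the asymptotic expansion for the quasi-stationary
distribution,

$$\pi_j^{(\varepsilon)} = \pi_j[0] + \pi_j[1]\varepsilon + \pi_j[2]\varepsilon^2 + o(\varepsilon^2), \quad j = 1, 2, 3,$$

where

$$\begin{aligned} \pi_j[0] &= d_{1j}[0] / e_1[0], \\ \pi_j[1] &= (d_{1j}[1] - e_1[1] \pi_j[0]) / e_1[0], \\ \pi_j[2] &= (d_{1j}[2] - e_1[2] \pi_j[0] - e_1[1] \pi_j[1]) / e_1[0]. \end{aligned} \tag{50}$$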

Using (48)-(50), the following numerical values are obtained,

$$\begin{aligned} \pi_1[0] &= 1/5, & \pi_2[0] &= 2/5, & \pi_3[0] &= 2/5, \\ \pi_1[1] &= -8/125, & \pi_2[1] &= -6/125, & \pi_3[1] &= 14/125, \\ \pi_1[2] &= 8/3125, & \pi_2[2] &= -19/3125, & \pi_3[2] &= 11/3125. \end{aligned}$$

Note here that $(\pi_1[0], \pi_2[0], \pi_3[0])$ is the stationary distribution of the limiting
Markov chain. It is also worth noticing that $\pi_1[n] + \pi_2[n] + \pi_3[n] = 0$ for $n = 1, 2$,
as expected.
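
As an informal cross-check of the example (not part of the original derivation), the second order expansion can be compared with a direct numerical computation. The sketch below assumes that, for this finite chain, the quasi-stationary distribution coincides with the normalized left Perron eigenvector of the transition matrix restricted to the states 1, 2, 3, and approximates that eigenvector by power iteration; the function names and the iteration count are choices made for the illustration.

```python
import math


def quasi_stationary_numeric(eps, iterations=500):
    """Normalized left Perron eigenvector of the chain restricted to states 1, 2, 3."""
    a = math.exp(-eps)
    b = 0.5 * math.exp(-2 * eps)
    Q = [[0.0, a, 0.0], [0.0, 0.0, a], [b, b, 0.0]]
    v = [1.0 / 3] * 3
    for _ in range(iterations):
        w = [sum(v[i] * Q[i][j] for i in range(3)) for j in range(3)]
        total = sum(w)
        v = [x / total for x in w]      # renormalize: condition on non-absorption
    return v


def quasi_stationary_expansion(eps):
    """Second order expansion with the coefficients pi_j[0], pi_j[1], pi_j[2] listed above."""
    coeffs = [
        (1 / 5, -8 / 125, 8 / 3125),
        (2 / 5, -6 / 125, -19 / 3125),
        (2 / 5, 14 / 125, 11 / 3125),
    ]
    return [c0 + c1 * eps + c2 * eps ** 2 for c0, c1, c2 in coeffs]


def max_error(eps):
    exact = quasi_stationary_numeric(eps)
    approx = quasi_stationary_expansion(eps)
    return max(abs(x - y) for x, y in zip(exact, approx))
```

The discrepancy returned by `max_error` should shrink roughly like $\varepsilon^3$ as $\varepsilon$ decreases, which is consistent with a remainder of order $o(\varepsilon^2)$.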


Asymptotic Expansions for Stationary
Distributions of Perturbed Semi-Markov 
Processes 


Dmitrii Silvestrov and Sergei Silvestrov 


Abstract New algorithms for computing asymptotic expansions for stationary 
distributions of nonlinearly perturbed semi-Markov processes are presented. The 
algorithms are based on special techniques of sequential phase space reduction, 
which can be applied to processes with asymptotically coupled and uncoupled finite 
phase spaces. 


Keywords Semi-Markov process - Birth-death-type process - Stationary 
distribution - Hitting time - Nonlinear perturbation - Laurent asymptotic expansion 


1 Introduction 


In this paper, we present new algorithms for constructing asymptotic expansions for
stationary distributions of nonlinearly perturbed semi-Markov processes with a finite
phase space.

We consider models, where the phase space of embedded Markov chains for 
pre-limiting perturbed semi-Markov processes is one class of communicative states, 
while the phase space for the limiting embedded Markov chain can consist of one 
or several closed classes of communicative states and, possibly, a class of transient 
states. 

The initial perturbation conditions are formulated in the forms of Taylor asymptotic
expansions for transition probabilities of the corresponding embedded Markov
chains and Laurent asymptotic expansions for expectations of sojourn times for
perturbed semi-Markov processes. Two forms of these expansions are considered,
with remainders given without or with explicit upper bounds.

The algorithms are based on special time-space screening procedures for sequen- 
tial phase space reduction and algorithms for re-calculation of asymptotic expansions 
and upper bounds for remainders, which constitute perturbation conditions for the 
semi-Markov processes with reduced phase spaces. 

The final asymptotic expansions for stationary distributions of nonlinearly per- 
turbed semi-Markov processes are given in the form of Taylor asymptotic expansions 
with remainders given without or with explicit upper bounds. 

The model of perturbed Markov chains and semi-Markov processes, in particular,
in the most difficult case of so-called singularly perturbed Markov chains and
semi-Markov processes with absorption and asymptotically uncoupled phase spaces,
attracted the attention of researchers in the middle of the 20th century.

The first works related to asymptotic problems for the above models are
Meshalkin [221], Simon and Ando [323], Hanen [106-109], Kingman [169],
Darroch and Seneta [65, 66], Keilson [160, 161], Seneta [273-275], Schweitzer
[265] and Korolyuk [177].

Here and henceforth, references in groups are given in the chronological order. 

The methods used for construction of asymptotic expansions for stationary
distributions and related functionals such as moments of hitting times can be split into
three groups.

The first and the most widely used methods are based on analysis of generalized
matrix and operator inverses of resolvent type for transition matrices and operators
for singularly perturbed Markov chains and semi-Markov processes. Mainly models
with linear, polynomial and analytic perturbations have been objects of studies.
We refer here to works by Schweitzer [265], Turbin [345], Poliščuk and Turbin
[256], Koroljuk, Brodi and Turbin [179], Pervozvanskii and Smirnov [247], Courtois
and Louchard [59], Korolyuk and Turbin [195, 196], Courtois [57], Latouche and
Louchard [209], Kokotović, Phillips and Javid [170], Korolyuk, Penev and Turbin
[190], Phillips and Kokotović [253], Delebecque [67], Abadov [1], Silvestrov and
Abadov [310-312], Kartashov [151, 158], Haviv [112], Korolyuk [178], Stewart and
Sun [339], Haviv, Ritov and Rothblum [121], Haviv and Ritov [119], Schweitzer
and Stewart [272], Stewart [335, 336], Yin and Zhang [354-357], Avrachenkov
[26, 27], Avrachenkov and Lasserre [34], Korolyuk, V.S. and Korolyuk, V.V. [180],
Yin, G., Zhang, Yang and Yin, K. [359], Avrachenkov and Haviv [31, 32], Craven
[64], Bini, Latouche and Meini [46], Korolyuk and Limnios [187] and Avrachenkov,
Filar and Howlett [30].

Aggregation/disaggregation methods based on various modifications of the Gauss
elimination method and space screening procedures for perturbed Markov chains
have been employed for approximation of stationary distributions for Markov chains
in works by Coderch, Willsky, Sastry and Castañon [53], Delebecque [67], Gaitsgori
and Pervozvanskii [89], Chatelin and Miranker [52], Courtois and Semal [61], Seneta
[277], Cao and Stewart [51], Vantilborgh [347], Feinberg and Chiu [82], Haviv [113,
115, 116], Sumita and Reiders [342], Meyer [224], Schweitzer [269], Stewart and
Zhang [340], Stewart [333], Kim and Smith [168], Marek and Pultarova [218], Marek,
Mayer and Pultarova [217] and Avrachenkov, Filar and Howlett [30].

Alternatively, the methods based on regenerative properties of Markov chains 
and semi-Markov processes, in particular, relations which link stationary probabil- 
ities and expectations of return times have been used for getting approximations 
for expectations of hitting times and stationary distributions in works by Grassman, 
Taksar and Heyman [94], Hassin and Haviv [111] and Hunter [140]. Also, the above 
mentioned relations and methods based on asymptotic expansions for nonlinearly 
perturbed regenerative processes developed in works by Silvestrov [301, 304, 305], 
Englund and Silvestrov [77], Gyllenberg and Silvestrov [99, 100, 102, 104], Englund 
[75, 76], Ni, Silvestrov and Malyarenko [243], Ni [238-242], Petersson [248, 252] 
and Silvestrov and Petersson [318] have been used for getting asymptotic expansions 
for stationary and quasi-stationary distributions for nonlinearly perturbed Markov 
chains and semi-Markov processes with absorption. 

We would like to mention that the present paper also contains a more extended
bibliography of works in the area, supplemented by short bibliographical remarks
given in the last section of the paper.

In the present paper, we combine methods based on stochastic aggregation/disag- 
gregation approach with methods based on asymptotic expansions for perturbed 
regenerative processes applied to perturbed semi-Markov processes. 

In the above mentioned works based on the stochastic aggregation/disaggregation
approach, space screening procedures for discrete time Markov chains are used. A
Markov chain with a reduced phase space is constructed from the initial one as
the sequence of its states at sequential moments of hitting the reduced phase
space. Times between sequential hittings of the reduced phase space are ignored. Such
a screening procedure preserves ratios of hitting frequencies for states from the reduced
phase space and, thus, the ratios of stationary probabilities are the same for the initial
and the reduced Markov chains. This implies that the stationary probabilities for the
reduced Markov chain coincide with the corresponding stationary probabilities for
the initial Markov chain up to the change of the corresponding normalizing factors.
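
The preservation of ratios of stationary probabilities under space screening can be illustrated on a small example. The sketch below is only an illustration: it removes a single state `k` from a finite Markov chain by the usual one-step elimination formula (an assumption of this example, valid when `P[k][k] < 1`), approximates stationary distributions by power iteration, and uses a hypothetical three-state matrix `P_EXAMPLE`.

```python
def eliminate_state(P, k):
    """Transition matrix of the chain observed only at visits to states other than k."""
    keep = [s for s in range(len(P)) if s != k]
    return [[P[i][j] + P[i][k] * P[k][j] / (1.0 - P[k][k]) for j in keep]
            for i in keep]


def stationary(P, iterations=2000):
    """Stationary distribution of an ergodic chain, approximated by power iteration."""
    m = len(P)
    v = [1.0 / m] * m
    for _ in range(iterations):
        v = [sum(v[s] * P[s][t] for s in range(m)) for t in range(m)]
    return v


# Hypothetical ergodic chain on three states (indices 0, 1, 2).
P_EXAMPLE = [[0.1, 0.6, 0.3],
             [0.4, 0.2, 0.4],
             [0.5, 0.3, 0.2]]
```

Comparing `stationary(P_EXAMPLE)` with `stationary(eliminate_state(P_EXAMPLE, 2))` shows that the first two stationary probabilities differ only by the common normalizing factor $1 - \pi_2$.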

We use another, more complex type of time-space screening procedure for
semi-Markov processes. In this case, a semi-Markov process with a reduced phase space
is constructed from the initial one as the sequence of its states at sequential moments
of hitting the reduced phase space, and times between sequential jumps of the
reduced semi-Markov process are times between sequential hittings of the reduced
space by the initial semi-Markov process. Such a screening procedure preserves
transition times between states from the reduced phase space, i.e., these times and, thus,
their expectations are the same for the initial and the reduced semi-Markov processes.
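
A small numerical illustration of this invariance is sketched below. It rests on assumptions made for the example rather than on formulas quoted from the paper: state `k` is removed by a one-step elimination, the expected sojourn time of a remaining state `i` is updated by first-step analysis to `e[i] + P[i][k] * e[k] / (1 - P[k][k])`, and the expected return time to state `i` is evaluated by the standard ergodic formula $\sum_j \rho_j e_j / \rho_i$, where $\rho$ is the stationary distribution of the embedded chain.

```python
def eliminate_state_semi_markov(P, e, k):
    """Reduced embedded chain and expected sojourn times after removing state k."""
    keep = [s for s in range(len(P)) if s != k]
    P_red = [[P[i][j] + P[i][k] * P[k][j] / (1.0 - P[k][k]) for j in keep]
             for i in keep]
    # Time until the next hit of the reduced space: own sojourn time plus,
    # if the jump leads to k, the total time spent in k before leaving it.
    e_red = [e[i] + P[i][k] * e[k] / (1.0 - P[k][k]) for i in keep]
    return P_red, e_red


def stationary(P, iterations=2000):
    """Stationary distribution of the embedded chain, approximated by power iteration."""
    m = len(P)
    v = [1.0 / m] * m
    for _ in range(iterations):
        v = [sum(v[s] * P[s][t] for s in range(m)) for t in range(m)]
    return v


def expected_return_time(P, e, i):
    """Expected return time to state i for an ergodic semi-Markov process."""
    rho = stationary(P)
    return sum(r * t for r, t in zip(rho, e)) / rho[i]


# Hypothetical example: embedded chain and expected sojourn times.
P_EXAMPLE = [[0.1, 0.6, 0.3],
             [0.4, 0.2, 0.4],
             [0.5, 0.3, 0.2]]
E_EXAMPLE = [1.0, 2.5, 0.7]
```

Under these assumptions, `expected_return_time` gives the same value for state 0 before and after removing state 2, up to rounding.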

We also formulate perturbation conditions in terms of asymptotic expansions
for transition characteristics of perturbed semi-Markov processes. The remainders
in these expansions and, thus, the transition characteristics of perturbed
semi-Markov processes can be non-analytical functions of perturbation parameters, which
distinguishes our setting from the results for models with linear, polynomial and analytical
perturbations.

We employ the methods of asymptotic analysis for nonlinearly perturbed regenerative
processes developed in works by Silvestrov [301, 304, 305] and Gyllenberg and
Silvestrov [99, 100, 102, 104] and applied to nonlinearly perturbed semi-Markov
processes. However, we use techniques of more general Laurent asymptotic expansions
instead of Taylor asymptotic expansions used in the above mentioned works
and combine these methods with the aggregation/disaggregation approach instead of
using the approach based on generalized matrix inverses. This permits us to consider
perturbed semi-Markov processes with an arbitrary communication structure of the
phase space for the limiting semi-Markov process, including the general case, where
this phase space may consist of one or several closed classes of communicative
states and, possibly, a class of transient states.

Another new element is that we consider asymptotic expansions with remainders 
given not only in the form o(-), but, also, with explicit upper bounds. 

It should be mentioned that the semi-Markov setting is an adequate and necessary
element of the method proposed in the paper. Even in the case where the initial
process is a discrete or continuous time Markov chain, the time-space screening
procedure of phase space reduction results in a semi-Markov process, since times
between sequential hittings of the reduced space by the initial process have
distributions which can differ from geometric or exponential ones.

The use of Laurent asymptotic expansions for expectations of sojourn times
of perturbed semi-Markov processes is also a necessary element of the method.
Indeed, even in the case where expectations of sojourn times for all states of the
initial semi-Markov process are asymptotically bounded and represented by Taylor
asymptotic expansions, the exclusion of an asymptotically absorbing state from the
initial phase space can cause the appearance of states with asymptotically unbounded
expectations of sojourn times, represented by Laurent asymptotic expansions, for the
reduced semi-Markov processes.

The method proposed in the paper can be considered as a stochastic analogue of
the Gauss elimination method. It is based on the procedure of sequential exclusion
of states from the phase space of a perturbed semi-Markov process accompanied
by re-calculation of the asymptotic expansions appearing in perturbation conditions for
semi-Markov processes with reduced phase spaces. The corresponding algorithms
are based on some kind of “operational calculus” for Laurent asymptotic expansions
with remainders given in two forms, without or with explicit upper bounds.
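
To make the analogy with Gauss elimination concrete, the following minimal Python sketch excludes one state from an ordinary (unperturbed) finite Markov chain. It is an illustration only: the function name `exclude_state`, the example matrix and the restriction to a plain transition matrix are our assumptions. The algorithms of the paper operate on Laurent asymptotic expansions of transition characteristics of semi-Markov processes and also re-calculate expectations of sojourn times, which is not shown here.

```python
from fractions import Fraction

def exclude_state(P, r):
    """Return the transition matrix of the chain observed only outside state r.

    P is a square row-stochastic matrix (list of lists) with P[r][r] < 1.
    New probability i -> j: p_ij + p_ir * p_rj / (1 - p_rr).
    """
    keep = [s for s in range(len(P)) if s != r]
    leave = 1 - P[r][r]  # probability of leaving state r in one step
    return [[P[i][j] + P[i][r] * P[r][j] / leave for j in keep] for i in keep]

# Hypothetical 3-state chain with exact rational entries.
F = Fraction
P = [[F(5, 10), F(3, 10), F(2, 10)],
     [F(2, 10), F(5, 10), F(3, 10)],
     [F(4, 10), F(4, 10), F(2, 10)]]
Q = exclude_state(P, 2)  # exclude the last state
```

Each row of `Q` still sums to one, so states can be excluded one after another, which is the sequential exclusion referred to above.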

The corresponding computational algorithms have a universal character. As was
mentioned above, they can be applied to perturbed semi-Markov processes with an
arbitrary asymptotic communicative structure and are computationally effective due
to the recurrent character of the computational procedures.

In conclusion, we would like to point out that, in our opinion, the results presented
in the paper have a good potential for continuation of studies (asymptotic expansions
for high order power and exponential moments for hitting times, aggregated time-space
screening procedures, asymptotic expansions for quasi-stationary distributions,
etc.). We comment on some prospective directions for future studies at the end of the
paper.
The paper includes seven sections. In Sect. 2, we present so-called operational
rules for Laurent asymptotic expansions. In Sect. 3, we formulate basic perturbation
conditions for Markov chains and semi-Markov processes and give basic formulas for
stationary distributions for semi-Markov processes, in particular, formulas
connecting stationary distributions with expectations of return times. In Sect. 4, we present
a one-step procedure of phase space reduction for semi-Markov processes and
algorithms for re-calculation of asymptotic expansions for transition characteristics of
perturbed semi-Markov processes with a reduced phase space. In Sect. 5, we present
algorithms of sequential reduction of phase space for semi-Markov processes. In
Sect. 6, we present algorithms for construction of asymptotic expansions for
stationary distributions for nonlinearly perturbed semi-Markov processes and the main results
of this paper, formulated in Theorems 10.8 and 10.9. In Sect. 7, we present some
directions for future studies and short bibliographical remarks concerning works in
the area.

We would like to conclude the introduction with the remark that the present 
paper is a slightly improved version of the research report Silvestrov, D. and 
Silvestrov S. [320]. 


2 Laurent Asymptotic Expansions 


In this section, we present so-called operational rules for Laurent asymptotic
expansions. We consider the corresponding results as possibly known, except for some of the
explicit formulas for remainders, in particular, those related to the product, reciprocal
and quotient rules.


2.1 Definition of Laurent Asymptotic Expansions 


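Let $A(\varepsilon)$ be a real-valued function defined on an interval $(0, \varepsilon_0]$, for some $0 < \varepsilon_0 \le 1$,
and given on this interval by a Laurent asymptotic expansion,

$$A(\varepsilon) = a_{h_A}\varepsilon^{h_A} + \cdots + a_{k_A}\varepsilon^{k_A} + o_A(\varepsilon^{k_A}), \tag{1}$$

where (a) $-\infty < h_A \le k_A < \infty$ are integers, (b) the coefficients $a_{h_A}, \ldots, a_{k_A}$ are real
numbers, (c) the function $o_A(\varepsilon^{k_A})/\varepsilon^{k_A} \to 0$ as $\varepsilon \to 0$.

We refer to such a Laurent asymptotic expansion as an $(h_A, k_A)$-expansion.

We say that an $(h_A, k_A)$-expansion $A(\varepsilon)$ is pivotal if it is known that $a_{h_A} \ne 0$.

A Laurent asymptotic expansion $A(\varepsilon)$ can also be referred to as a Taylor asymptotic
expansion if $h_A \ge 0$.

We also say that an $(h_A, k_A)$-expansion $A(\varepsilon)$ is an $(h_A, k_A, \delta_A, G_A, \varepsilon_A)$-expansion if its
remainder $o_A(\varepsilon^{k_A})$ satisfies the following inequalities: (d) $|o_A(\varepsilon^{k_A})| \le G_A \varepsilon^{k_A + \delta_A}$, for
$0 < \varepsilon \le \varepsilon_A$, where (e) $0 < \delta_A \le 1$, $0 < G_A < \infty$ and $0 < \varepsilon_A \le \varepsilon_0$.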
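
As a hedged illustration of these definitions (the function below is our own example, not taken from the source), consider a function whose remainder is not analytic in the perturbation parameter:

$$A(\varepsilon) = \varepsilon^{-1} + 2 + 3\varepsilon^{3/2}, \qquad \varepsilon \in (0, 1].$$

Read as an expansion with $h_A = -1$, $k_A = 0$, it has coefficients $a_{-1} = 1$, $a_0 = 2$ and remainder $o_A(\varepsilon^{0}) = 3\varepsilon^{3/2} \to 0$, so it is pivotal. Since $|3\varepsilon^{3/2}| \le 3\varepsilon^{0+1}$ for $0 < \varepsilon \le 1$, it is also a $(-1, 0, 1, 3, 1)$-expansion. The same function can be read with $k_A = 1$ and $a_1 = 0$, because $|3\varepsilon^{3/2}| \le 3\varepsilon^{1 + 1/2}$, which gives a $(-1, 1, \tfrac12, 3, 1)$-expansion.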
In what follows, $[a]$ is the integer part of a real number $a$.

Also, the indicator of relation $A = B$ is denoted as $I(A = B)$. It equals 1 if
$A = B$, and 0 if $A \ne B$.

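It is useful to note that there seems to be no sense in considering a more general case
of upper bounds for the remainder $o_A(\varepsilon^{k_A})$, with parameter $\delta_A > 1$. Indeed, let us
define $k'_A = k_A + [\delta_A] - I(\delta_A = [\delta_A])$ and $\delta'_A = \delta_A - [\delta_A] + I(\delta_A = [\delta_A]) \in (0, 1]$.

The $(h_A, k_A, \delta_A, G_A, \varepsilon_A)$-expansion (1) can be re-written in the equivalent form of
the $(h_A, k'_A, \delta'_A, G_A, \varepsilon_A)$-expansion,

$$A(\varepsilon) = a_{h_A}\varepsilon^{h_A} + \cdots + a_{k_A}\varepsilon^{k_A} + 0\,\varepsilon^{k_A+1} + \cdots + 0\,\varepsilon^{k'_A} + o'_A(\varepsilon^{k'_A}), \tag{2}$$

with the remainder term $o'_A(\varepsilon^{k'_A}) = o_A(\varepsilon^{k_A})$, which satisfies the inequalities $|o'_A(\varepsilon^{k'_A})| =
|o_A(\varepsilon^{k_A})| \le G_A\varepsilon^{k_A+\delta_A} = G_A\varepsilon^{k'_A+\delta'_A}$, for $0 < \varepsilon \le \varepsilon_A$.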
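
The re-parametrisation above is easy to mechanise. The following Python sketch (the function name `normalize_remainder` is ours, and floating-point $\delta$ is an assumption made for brevity) maps a pair $(k_A, \delta_A)$ with $\delta_A > 0$ to the pair $(k'_A, \delta'_A)$ with $\delta'_A \in (0, 1]$ and $k'_A + \delta'_A = k_A + \delta_A$.

```python
import math

def normalize_remainder(k, delta):
    """Map (k, delta), delta > 0, to (k', delta') with delta' in (0, 1]."""
    whole = math.floor(delta)             # [delta], the integer part
    is_int = 1 if delta == whole else 0   # indicator I(delta = [delta])
    return k + whole - is_int, delta - whole + is_int
```

For instance, a bound of order $\varepsilon^{3 + 2}$ is re-read as order $\varepsilon^{4 + 1}$, and a bound of order $\varepsilon^{3 + 1.5}$ as order $\varepsilon^{4 + 0.5}$.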

Relation (2) implies that the asymptotic expansion $A(\varepsilon)$ can be represented in
different forms. In such cases, we consider as more informative the form with larger
parameters $h_A$ and $k_A$. As far as parameters $\delta_A$, $G_A$ and $\varepsilon_A$ are concerned, we consider
as more informative the form, first, with the larger value of parameter $\delta_A$, second, with
the smaller value of parameter $G_A$ and, third, with the larger value of parameter $\varepsilon_A$.

In what follows, $a \vee b = \max(a, b)$, $a \wedge b = \min(a, b)$, for real numbers $a$ and $b$.

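It is useful to note that formula (1) uniquely defines coefficients $a_{h_A}, \ldots, a_{k_A}$.

Lemma 10.1 If function $A(\varepsilon) = a'_{h'_A}\varepsilon^{h'_A} + \cdots + a'_{k'_A}\varepsilon^{k'_A} + o'_A(\varepsilon^{k'_A}) = a''_{h''_A}\varepsilon^{h''_A} +
\cdots + a''_{k''_A}\varepsilon^{k''_A} + o''_A(\varepsilon^{k''_A})$, $\varepsilon \in (0, \varepsilon_0]$ can be represented as, respectively, an $(h'_A, k'_A)$- and an
$(h''_A, k''_A)$-expansion, then the asymptotic expansion for function $A(\varepsilon)$ can be
represented in the following most informative form $A(\varepsilon) = a_{h_A}\varepsilon^{h_A} + \cdots + a_{k_A}\varepsilon^{k_A} +
o_A(\varepsilon^{k_A})$, $\varepsilon \in (0, \varepsilon_0]$ of an $(h_A, k_A)$-expansion, with parameters $h_A = h'_A \vee h''_A$, $k_A = k'_A \vee
k''_A$, and coefficients $a_{h_A}, \ldots, a_{k_A}$ and remainder $o_A(\varepsilon^{k_A})$ given by the following
relations:

(i) $a'_l, a''_l = 0$, for $l < h_A$.
(ii) $a_l = a'_l = a''_l$, for $h_A \le l \le \tilde{k}_A = k'_A \wedge k''_A$.
(iii) $a_l = a''_l$, for $\tilde{k}_A = k'_A < l \le k_A$ if $k'_A < k''_A$.
(iv) $a_l = a'_l$, for $\tilde{k}_A = k''_A < l \le k_A$ if $k''_A < k'_A$.
(v) The remainder term $o_A(\varepsilon^{k_A})$ is given by the following relation,

$$o_A(\varepsilon^{k_A}) = \begin{cases} o''_A(\varepsilon^{k''_A}) & \text{if } k'_A < k''_A, \\ o'_A(\varepsilon^{k'_A}) = o''_A(\varepsilon^{k''_A}) & \text{if } k'_A = k''_A, \\ o'_A(\varepsilon^{k'_A}) & \text{if } k'_A > k''_A. \end{cases} \tag{3}$$

The latter asymptotic expansion is pivotal if and only if $a_{h_A} = a'_{h_A} = a''_{h_A} \ne 0$.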


It is useful to make some additional remarks. 

The case $\tilde{k}_A < h_A$ is possible. In this case, the set of integers $l$ such that
$h_A \le l \le \tilde{k}_A$ is empty. This can happen if $k'_A < h''_A$ or $k''_A < h'_A$. In the first case, all
coefficients $a'_l = 0$, $l = h'_A, \ldots, k'_A$, while $h_A = h''_A$, $k_A = k''_A$, $a_l = a''_l$, $l = h_A, \ldots, k_A$.
In the second case, all coefficients $a''_l = 0$, $l = h''_A, \ldots, k''_A$, while $h_A = h'_A$, $k_A = k'_A$,
$a_l = a'_l$, $l = h_A, \ldots, k_A$.

If $k'_A = k''_A$ then $h_A \le \tilde{k}_A = k_A$ and the set of integers $l$ such that $\tilde{k}_A < l \le k_A$ is
empty. In this case, all coefficients $a_l = a'_l = a''_l$, $l = h_A, \ldots, k_A$.

If $a'_{h'_A} \ne 0$ then $h_A = h'_A$, and $a_{h_A} = a'_{h'_A} \ne 0$. If $a''_{h''_A} \ne 0$ then $h_A = h''_A$, and $a_{h_A} =
a''_{h''_A} \ne 0$. If $a'_{h'_A}, a''_{h''_A} \ne 0$ then $h_A = h'_A = h''_A$ and $a_{h_A} = a'_{h_A} = a''_{h_A} \ne 0$.
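
The combination rule of Lemma 10.1 can be sketched in a few lines of Python. This is an illustration under our own conventions, not code from the source: an expansion is stored as a pair `(h, [a_h, ..., a_k])`, remainders are left implicit, and the helper name `merge` is hypothetical. Coefficients below the first power are treated as known zeros, coefficients above the last power as unknown.

```python
def merge(E1, E2):
    """Combine two expansions (h, [a_h, ..., a_k]) of the same function."""
    (h1, a1), (h2, a2) = E1, E2
    k1, k2 = h1 + len(a1) - 1, h2 + len(a2) - 1
    h, k = max(h1, h2), max(k1, k2)

    def known(h0, coef, l):
        # 0 below h0, the stored value on [h0, k0], None (unknown) above k0
        if l < h0:
            return 0
        return coef[l - h0] if l - h0 < len(coef) else None

    merged = []
    for l in range(min(h1, h2), k + 1):
        c1, c2 = known(h1, a1, l), known(h2, a2, l)
        if c1 is not None and c2 is not None and c1 != c2:
            raise ValueError(f"expansions disagree at power {l}")
        if l >= h:
            merged.append(c1 if c1 is not None else c2)
    return h, merged
```

The consistency check reflects the uniqueness of coefficients noted before the lemma: two expansions of one function cannot disagree on a power that both of them resolve.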
The following proposition supplements Lemma 10.1. 


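Lemma 10.2 If $A(\varepsilon) = a'_{h'_A}\varepsilon^{h'_A} + \cdots + a'_{k'_A}\varepsilon^{k'_A} + o'_A(\varepsilon^{k'_A}) = a''_{h''_A}\varepsilon^{h''_A} + \cdots +
a''_{k''_A}\varepsilon^{k''_A} + o''_A(\varepsilon^{k''_A})$, $\varepsilon \in (0, \varepsilon_0]$ can be represented as, respectively, an $(h'_A, k'_A, \delta'_A, G'_A, \varepsilon'_A)$-
and an $(h''_A, k''_A, \delta''_A, G''_A, \varepsilon''_A)$-expansion, then:

(i) The asymptotic expansion $A(\varepsilon) = a_{h_A}\varepsilon^{h_A} + \cdots + a_{k_A}\varepsilon^{k_A} + o_A(\varepsilon^{k_A})$, $\varepsilon \in (0, \varepsilon_0]$
given in Lemma 10.1 is an $(h_A, k_A, \delta_A, G_A, \varepsilon_A)$-expansion with parameters
$G_A$, $\delta_A$ and $\varepsilon_A$ which can be chosen in the following way, consistent with the
priority order described above:

$$(\delta_A, G_A, \varepsilon_A) = \begin{cases} (\delta''_A, G''_A, \varepsilon''_A) & \text{if } k'_A < k''_A, \\ (\delta''_A, G''_A, \varepsilon''_A) & \text{if } k'_A = k''_A,\ \delta'_A < \delta''_A, \\ (\delta'_A = \delta''_A,\ G'_A \wedge G''_A,\ \varepsilon'_A \wedge \varepsilon''_A) & \text{if } k'_A = k''_A,\ \delta'_A = \delta''_A, \\ (\delta'_A, G'_A, \varepsilon'_A) & \text{if } k'_A = k''_A,\ \delta'_A > \delta''_A, \\ (\delta'_A, G'_A, \varepsilon'_A) & \text{if } k'_A > k''_A. \end{cases} \tag{4}$$

(ii) The asymptotic expansion $A(\varepsilon)$ can also be represented in the form $A(\varepsilon) =
a_{h_A}\varepsilon^{h_A} + \cdots + a_{\tilde{k}_A}\varepsilon^{\tilde{k}_A} + \tilde{o}_A(\varepsilon^{\tilde{k}_A})$ of an $(h_A, \tilde{k}_A, \tilde{\delta}_A, \tilde{G}_A, \tilde{\varepsilon}_A)$-expansion, with
parameters $h_A = h'_A \vee h''_A$, $\tilde{k}_A = k'_A \wedge k''_A$ and parameters $\tilde{\delta}_A, \tilde{G}_A, \tilde{\varepsilon}_A$ given by the following
formulas,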


! en) ” 
; 5 if, < kN, 
ie =] SAS fk, HK, 
/ ” 
SY fk’, > KY, 
~I—k— ahi +60 — 
Or (Ey eagle, © Gah ead, 
G = _ ea 5a “yf __ pl 
: Gata’ Gaba Iki 8% Ki, +84—ki 3% ha = hy 
Wa ai Ka— 04 IB +54 —ki 84 +e ys ” 
Gh pop lde, +e! ) ifky > kK 


Eq = 6, A €4. (5) 


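(iii) The remainders $o'_A(\varepsilon^{k'_A})$, $o''_A(\varepsilon^{k''_A})$, $o_A(\varepsilon^{k_A})$ and $\tilde{o}_A(\varepsilon^{\tilde{k}_A})$ are connected by the
following relations:

$$\tilde{o}_A(\varepsilon^{\tilde{k}_A}) = o_A(\varepsilon^{k_A}) + \sum_{\tilde{k}_A < l \le k_A} a_l \varepsilon^{l} = \begin{cases} o'_A(\varepsilon^{k'_A}) & \text{if } k'_A < k''_A, \\ o'_A(\varepsilon^{k'_A}) = o''_A(\varepsilon^{k''_A}) & \text{if } k'_A = k''_A, \\ o''_A(\varepsilon^{k''_A}) & \text{if } k'_A > k''_A. \end{cases} \tag{6}$$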


2.2 Operational Rules for Laurent Asymptotic Expansions 


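Let us consider four Laurent asymptotic expansions, $A(\varepsilon) = a_{h_A}\varepsilon^{h_A} + \cdots +
a_{k_A}\varepsilon^{k_A} + o_A(\varepsilon^{k_A})$, $B(\varepsilon) = b_{h_B}\varepsilon^{h_B} + \cdots + b_{k_B}\varepsilon^{k_B} + o_B(\varepsilon^{k_B})$, $C(\varepsilon) = c_{h_C}\varepsilon^{h_C} + \cdots
+ c_{k_C}\varepsilon^{k_C} + o_C(\varepsilon^{k_C})$, and $D(\varepsilon) = d_{h_D}\varepsilon^{h_D} + \cdots + d_{k_D}\varepsilon^{k_D} + o_D(\varepsilon^{k_D})$, defined for $0 <
\varepsilon \le \varepsilon_0$, for some $0 < \varepsilon_0 \le 1$.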

The following lemma presents “operational” rules for Laurent asymptotic expan- 
sions. 


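Lemma 10.3 The above asymptotic expansions have the following operational rules
for computing coefficients:

(i) If $A(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$ is an $(h_A, k_A)$-expansion and $c$ is a constant, then $C(\varepsilon) =
cA(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$ is an $(h_C, k_C)$-expansion with parameters $h_C = h_A$, $k_C = k_A$
and coefficients,

$$c_{h_C + r} = c\,a_{h_C + r}, \quad r = 0, \ldots, k_C - h_C. \tag{7}$$

This expansion is pivotal if and only if $c_{h_C} = c\,a_{h_A} \ne 0$.

(ii) If $A(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$ is an $(h_A, k_A)$-expansion and $B(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$ is an $(h_B, k_B)$-expansion,
then $C(\varepsilon) = A(\varepsilon) + B(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$ is an $(h_C, k_C)$-expansion with
parameters $h_C = h_A \wedge h_B$, $k_C = k_A \wedge k_B$, and coefficients,

$$c_{h_C + r} = a_{h_C + r} + b_{h_C + r}, \quad r = 0, \ldots, k_C - h_C, \tag{8}$$

where $a_{h_C + r} = 0$ for $0 \le r < h_A - h_C$ and $b_{h_C + r} = 0$ for $0 \le r < h_B - h_C$.
This expansion is pivotal if and only if $c_{h_C} = a_{h_C} + b_{h_C} \ne 0$.

(iii) If $A(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$ is an $(h_A, k_A)$-expansion and $B(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$ is an $(h_B, k_B)$-expansion,
then $C(\varepsilon) = A(\varepsilon) \cdot B(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$ is an $(h_C, k_C)$-expansion with
parameters $h_C = h_A + h_B$, $k_C = (h_A + k_B) \wedge (h_B + k_A)$, and coefficients,

$$c_{h_C + r} = \sum_{0 \le i \le r} a_{h_A + i}\, b_{h_B + r - i}, \quad r = 0, \ldots, k_C - h_C. \tag{9}$$

This expansion is pivotal if and only if $c_{h_C} = a_{h_A} b_{h_B} \ne 0$.

(iv) If $B(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$ is a pivotal $(h_B, k_B)$-expansion, then there exists $0 < \varepsilon'_0 \le \varepsilon_0$
such that $B(\varepsilon) \ne 0$, $\varepsilon \in (0, \varepsilon'_0]$, and $C(\varepsilon) = 1/B(\varepsilon)$, $\varepsilon \in (0, \varepsilon'_0]$ is a pivotal
$(h_C, k_C)$-expansion with parameters $h_C = -h_B$, $k_C = k_B - 2h_B$ and coefficients,

$$c_{h_C} = b_{h_B}^{-1}, \quad c_{h_C + r} = -b_{h_B}^{-1} \sum_{1 \le i \le r} b_{h_B + i}\, c_{h_C + r - i}, \quad r = 1, \ldots, k_C - h_C. \tag{10}$$

(v) If $A(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$ is an $(h_A, k_A)$-expansion and $B(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$ is a pivotal
$(h_B, k_B)$-expansion, then there exists $0 < \varepsilon'_0 \le \varepsilon_0$ such that $B(\varepsilon) \ne 0$, $\varepsilon \in
(0, \varepsilon'_0]$, and $D(\varepsilon) = A(\varepsilon)/B(\varepsilon)$, $\varepsilon \in (0, \varepsilon'_0]$ is an $(h_D, k_D)$-expansion with
parameters $h_D = h_A - h_B$, $k_D = (k_A - h_B) \wedge (h_A + k_B - 2h_B)$, and coefficients,

$$d_{h_D + r} = \sum_{0 \le i \le r} c_{h_C + i}\, a_{h_A + r - i}, \quad r = 0, \ldots, k_D - h_D, \tag{11}$$

where $c_{h_C + j}$, $j = 0, \ldots, k_C - h_C$ are the coefficients of the $(h_C, k_C)$-expansion $C(\varepsilon) =
1/B(\varepsilon)$ given in the above proposition (iv), or by formulas,

$$d_{h_D + r} = b_{h_B}^{-1}\Big(a_{h_A + r} - \sum_{1 \le i \le r} b_{h_B + i}\, d_{h_D + r - i}\Big), \quad r = 0, \ldots, k_D - h_D. \tag{12}$$

This expansion is pivotal if and only if $d_{h_D} = a_{h_A} c_{h_C} = a_{h_A}/b_{h_B} \ne 0$.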
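
The coefficient rules of Lemma 10.3 translate almost literally into code. The Python sketch below is a minimal illustration under our own conventions (an expansion is a pair `(h, [a_h, ..., a_k])`, remainders are implicit, and the function names are hypothetical); it uses exact rational arithmetic via `fractions.Fraction`.

```python
from fractions import Fraction

def scale(c, A):                      # rule (i), formula (7)
    h, a = A
    return h, [c * x for x in a]

def add(A, B):                        # rule (ii), formula (8)
    (hA, a), (hB, b) = A, B
    hC = min(hA, hB)
    kC = min(hA + len(a) - 1, hB + len(b) - 1)
    coef = lambda h, v, l: v[l - h] if 0 <= l - h < len(v) else 0
    return hC, [coef(hA, a, l) + coef(hB, b, l) for l in range(hC, kC + 1)]

def mul(A, B):                        # rule (iii), formula (9)
    (hA, a), (hB, b) = A, B
    w = min(len(a), len(b)) - 1       # k_C - h_C
    return hA + hB, [sum(a[i] * b[r - i] for i in range(r + 1)) for r in range(w + 1)]

def reciprocal(B):                    # rule (iv), formula (10)
    hB, b = B
    if b[0] == 0:
        raise ValueError("the expansion must be pivotal")
    c = [1 / Fraction(b[0])]
    for r in range(1, len(b)):
        c.append(-c[0] * sum(b[i] * c[r - i] for i in range(1, r + 1)))
    return -hB, c

def div(A, B):                        # rule (v), formula (11)
    return mul(A, reciprocal(B))
```

For example, `reciprocal((1, [2, 1]))` corresponds to $1/(2\varepsilon + \varepsilon^2 + o(\varepsilon^2))$ and yields the coefficients of $\varepsilon^{-1}$ and $\varepsilon^{0}$. Note that the product and the quotient keep only as many coefficients as the shorter of the two operands.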


The following proposition presents “operational” rules for computing parameters 
of upper bounds for remainders of Laurent asymptotic expansions. 


Lemma 10.4 The above asymptotic expansions have the following operational rules 
for computing remainders: 


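(i) If $A(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$ is an $(h_A, k_A, \delta_A, G_A, \varepsilon_A)$-expansion and $c$ is a constant, then
$C(\varepsilon) = cA(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$ is an $(h_C, k_C, \delta_C, G_C, \varepsilon_C)$-expansion with parameters
$h_C = h_A$, $k_C = k_A$, coefficients $c_r$, $r = h_C, \ldots, k_C$ given in proposition (i) of
Lemma 10.3, and parameters $\delta_C, G_C, \varepsilon_C$ given by the following formulas,

$$\delta_C = \delta_A, \quad G_C = |c|\,G_A, \quad \varepsilon_C = \varepsilon_A. \tag{13}$$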


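(ii) If $A(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$ is an $(h_A, k_A, \delta_A, G_A, \varepsilon_A)$-expansion and $B(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$ is
an $(h_B, k_B, \delta_B, G_B, \varepsilon_B)$-expansion, then $C(\varepsilon) = A(\varepsilon) + B(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$ is an
$(h_C, k_C, \delta_C, G_C, \varepsilon_C)$-expansion with parameters $h_C = h_A \wedge h_B$, $k_C = k_A \wedge k_B$,
coefficients $c_r$, $r = h_C, \ldots, k_C$ given in proposition (ii) of Lemma 10.3, and
parameters $\delta_C, G_C, \varepsilon_C$ given by the following formulas,

$$\delta_C = \begin{cases} \delta_A & \text{if } k_C = k_A < k_B, \\ \delta_A \wedge \delta_B & \text{if } k_C = k_A = k_B, \\ \delta_B & \text{if } k_C = k_B < k_A, \end{cases} \ \ \ge \delta_A \wedge \delta_B,$$

$$G_C = G_A \varepsilon_C^{k_A + \delta_A - k_C - \delta_C} + \sum_{k_C < i \le k_A} |a_i|\, \varepsilon_C^{i - k_C - \delta_C} + G_B \varepsilon_C^{k_B + \delta_B - k_C - \delta_C} + \sum_{k_C < j \le k_B} |b_j|\, \varepsilon_C^{j - k_C - \delta_C},$$

$$\varepsilon_C = \varepsilon_A \wedge \varepsilon_B. \tag{14}$$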


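(iii) If $A(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$ is an $(h_A, k_A, \delta_A, G_A, \varepsilon_A)$-expansion and $B(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$
is an $(h_B, k_B, \delta_B, G_B, \varepsilon_B)$-expansion, then $C(\varepsilon) = A(\varepsilon) \cdot B(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$ is
an $(h_C, k_C, \delta_C, G_C, \varepsilon_C)$-expansion with parameters $h_C = h_A + h_B$, $k_C = (h_A +
k_B) \wedge (h_B + k_A)$, coefficients $c_r$, $r = h_C, \ldots, k_C$ given in proposition (iii) of
Lemma 10.3, and parameters $\delta_C, G_C, \varepsilon_C$ given by the following formulas,

$$\delta_C = \begin{cases} \delta_A & \text{if } k_C = h_B + k_A < h_A + k_B, \\ \delta_A \wedge \delta_B & \text{if } k_C = h_B + k_A = h_A + k_B, \\ \delta_B & \text{if } k_C = h_A + k_B < h_B + k_A, \end{cases} \ \ \ge \delta_A \wedge \delta_B,$$

$$\begin{aligned} G_C ={}& \sum_{k_C < i + j,\ h_A \le i \le k_A,\ h_B \le j \le k_B} |a_i||b_j|\, \varepsilon_C^{i + j - k_C - \delta_C} + G_A \sum_{h_B \le j \le k_B} |b_j|\, \varepsilon_C^{j + k_A + \delta_A - k_C - \delta_C} \\ &+ G_B \sum_{h_A \le i \le k_A} |a_i|\, \varepsilon_C^{i + k_B + \delta_B - k_C - \delta_C} + G_A G_B\, \varepsilon_C^{k_A + k_B + \delta_A + \delta_B - k_C - \delta_C}, \end{aligned}$$

$$\varepsilon_C = \varepsilon_A \wedge \varepsilon_B. \tag{15}$$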


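(iv) If $B(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$ is a pivotal $(h_B, k_B, \delta_B, G_B, \varepsilon_B)$-expansion, then there exists
$\varepsilon_C \le \varepsilon'_0 \le \varepsilon_0$ such that $B(\varepsilon) \ne 0$, $\varepsilon \in (0, \varepsilon'_0]$, and $C(\varepsilon) = 1/B(\varepsilon)$, $\varepsilon \in (0, \varepsilon'_0]$
is a pivotal $(h_C, k_C, \delta_C, G_C, \varepsilon_C)$-expansion with parameters $h_C = -h_B$, $k_C =
k_B - 2h_B$, coefficients $c_r$, $r = h_C, \ldots, k_C$ given in proposition (iv) of
Lemma 10.3, and parameters $\delta_C, G_C, \varepsilon_C$ given by the following formulas,

$$\delta_C = \delta_B,$$

$$G_C = \Big(\frac{|b_{h_B}|}{2}\Big)^{-1} \Big( \sum_{k_B - h_B < i + j,\ h_B \le i \le k_B,\ h_C \le j \le k_C} |b_i||c_j|\, \varepsilon_C^{i + j - k_B + h_B - \delta_B} + G_B \sum_{h_C \le j \le k_C} |c_j|\, \varepsilon_C^{j + h_B} \Big),$$

$$\varepsilon_C = \begin{cases} \varepsilon_B \wedge \dfrac{|b_{h_B}|}{2} \Big( \sum\limits_{h_B < i \le k_B} |b_i|\, \varepsilon_B^{i - h_B - 1} + G_B\, \varepsilon_B^{k_B + \delta_B - h_B - 1} \Big)^{-1} & \text{if } h_B < k_B, \\[2ex] \varepsilon_B \wedge \Big( \dfrac{|b_{h_B}|}{2 G_B} \Big)^{1/\delta_B} & \text{if } h_B = k_B. \end{cases} \tag{16}$$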


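(v) If $A(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$ is an $(h_A, k_A, \delta_A, G_A, \varepsilon_A)$-expansion and $B(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$ is a
pivotal $(h_B, k_B, \delta_B, G_B, \varepsilon_B)$-expansion, then there exists $\varepsilon_D \le \varepsilon'_0 \le \varepsilon_0$ such that
$B(\varepsilon) \ne 0$, $\varepsilon \in (0, \varepsilon'_0]$, and $D(\varepsilon) = A(\varepsilon)/B(\varepsilon)$ is an $(h_D, k_D, \delta_D, G_D, \varepsilon_D)$-expansion
with parameters $h_D = h_A + h_C = h_A - h_B$, $k_D = (k_A + h_C) \wedge
(h_A + k_C) = (k_A - h_B) \wedge (h_A + k_B - 2h_B)$, coefficients $d_r$, $r = h_D, \ldots, k_D$
given in proposition (v) of Lemma 10.3, and parameters $\delta_D, G_D, \varepsilon_D$ given by
the following formulas,

$$\delta_D = \begin{cases} \delta_A & \text{if } k_D = h_C + k_A < h_A + k_C, \\ \delta_A \wedge \delta_C & \text{if } k_D = h_C + k_A = h_A + k_C, \\ \delta_C & \text{if } k_D = h_A + k_C < h_C + k_A, \end{cases} \ \ \ge \delta_A \wedge \delta_C = \delta_A \wedge \delta_B,$$

$$\begin{aligned} G_D ={}& \sum_{k_D < i + j,\ h_A \le i \le k_A,\ h_C \le j \le k_C} |a_i||c_j|\, \varepsilon_D^{i + j - k_D - \delta_D} + G_A \sum_{h_C \le j \le k_C} |c_j|\, \varepsilon_D^{j + k_A + \delta_A - k_D - \delta_D} \\ &+ G_C \sum_{h_A \le i \le k_A} |a_i|\, \varepsilon_D^{i + k_C + \delta_C - k_D - \delta_D} + G_A G_C\, \varepsilon_D^{k_A + k_C + \delta_A + \delta_C - k_D - \delta_D}, \end{aligned}$$

$$\varepsilon_D = \varepsilon_A \wedge \varepsilon_C, \tag{17}$$

where coefficients $c_r$, $r = h_C, \ldots, k_C$ and parameters $h_C, k_C, \delta_C, G_C, \varepsilon_C$ are
given for the $(h_C, k_C, \delta_C, G_C, \varepsilon_C)$-expansion of function $C(\varepsilon) = 1/B(\varepsilon)$ in
proposition (iv), or by formulas,

$$\delta_D = \begin{cases} \delta_A & \text{if } k_D = k_A - h_B < h_A + k_B - 2h_B, \\ \delta_A \wedge \delta_B & \text{if } k_D = k_A - h_B = h_A + k_B - 2h_B, \\ \delta_B & \text{if } k_D = h_A + k_B - 2h_B < k_A - h_B, \end{cases}$$

$$\begin{aligned} G_D = \Big(\frac{|b_{h_B}|}{2}\Big)^{-1} \Big( & \sum_{k_A \wedge (h_A + k_B - h_B) < i \le k_A} |a_i|\, \varepsilon_D^{i - k_D - h_B - \delta_D} + \sum_{k_A \wedge (h_A + k_B - h_B) < i + j,\ h_B \le i \le k_B,\ h_D \le j \le k_D} |b_i||d_j|\, \varepsilon_D^{i + j - k_D - h_B - \delta_D} \\ &+ G_A\, \varepsilon_D^{k_A + \delta_A - h_B - k_D - \delta_D} + G_B \sum_{h_D \le j \le k_D} |d_j|\, \varepsilon_D^{j + k_B + \delta_B - h_B - k_D - \delta_D} \Big), \end{aligned}$$

$$\varepsilon_D = \begin{cases} \varepsilon_A \wedge \varepsilon_B \wedge \dfrac{|b_{h_B}|}{2} \Big( \sum\limits_{h_B < i \le k_B} |b_i|\, \varepsilon_B^{i - h_B - 1} + G_B\, \varepsilon_B^{k_B + \delta_B - h_B - 1} \Big)^{-1} & \text{if } h_B < k_B, \\[2ex] \varepsilon_A \wedge \varepsilon_B \wedge \Big( \dfrac{|b_{h_B}|}{2 G_B} \Big)^{1/\delta_B} & \text{if } h_B = k_B. \end{cases} \tag{18}$$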
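
The summation and multiplication rules for remainders, formulas (14) and (15), can be sketched in Python as follows. This is an illustration under our own conventions rather than the authors' implementation: an expansion with an explicit bound is stored as a tuple `(h, [a_h, ..., a_k], delta, G, eps)`, and the names `add_bound` and `mul_bound` are hypothetical.

```python
def add_bound(A, B):
    """Remainder parameters (delta_C, G_C, eps_C) for C = A + B, as in (14)."""
    (hA, a, dA, GA, eA), (hB, b, dB, GB, eB) = A, B
    kA, kB = hA + len(a) - 1, hB + len(b) - 1
    kC, eC = min(kA, kB), min(eA, eB)
    dC = dA if kA < kB else dB if kB < kA else min(dA, dB)
    s = kC + dC

    def tail(h, coef, k):
        # sum of |coef_i| * eC**(i - kC - dC) over kC < i <= k (zero below h)
        return sum(abs(coef[i - h]) * eC ** (i - s) for i in range(max(kC + 1, h), k + 1))

    GC = (GA * eC ** (kA + dA - s) + tail(hA, a, kA)
          + GB * eC ** (kB + dB - s) + tail(hB, b, kB))
    return dC, GC, eC

def mul_bound(A, B):
    """Remainder parameters (delta_C, G_C, eps_C) for C = A * B, as in (15)."""
    (hA, a, dA, GA, eA), (hB, b, dB, GB, eB) = A, B
    kA, kB = hA + len(a) - 1, hB + len(b) - 1
    kC, eC = min(hA + kB, hB + kA), min(eA, eB)
    dC = dA if hB + kA < hA + kB else dB if hA + kB < hB + kA else min(dA, dB)
    s = kC + dC
    GC = sum(abs(x) * abs(y) * eC ** (i + j - s)
             for i, x in enumerate(a, hA) for j, y in enumerate(b, hB) if i + j > kC)
    GC += GA * sum(abs(y) * eC ** (j + kA + dA - s) for j, y in enumerate(b, hB))
    GC += GB * sum(abs(x) * eC ** (i + kB + dB - s) for i, x in enumerate(a, hA))
    GC += GA * GB * eC ** (kA + kB + dA + dB - s)
    return dC, GC, eC

# Hypothetical data: A(e) = 1 + e + e**2 read as a (0, 1)-expansion with remainder e**2,
# B(e) = 2 + e read as a (0, 0)-expansion with remainder e; both bounds hold on (0, 1].
A = (0, [1, 1], 1, 1, 1)
B = (0, [2], 1, 1, 1)
```

With these data, the sum has $k_C = 0$ and true remainder $2\varepsilon + \varepsilon^2$, and the product has $k_C = 0$ and true remainder $3\varepsilon + 3\varepsilon^2 + \varepsilon^3$; both can be compared numerically with the bound $G_C\varepsilon^{k_C + \delta_C}$ returned by the functions.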


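In what follows, the following two lemmas, which present recurrent operational
rules for computing coefficients and remainders for multiple summations and
multiplications of Laurent asymptotic expansions, will also be used. These lemmas are
direct corollaries of Lemmas 10.3 and 10.4.

Let $A_m(\varepsilon) = a_{h_{A_m}, m}\varepsilon^{h_{A_m}} + \cdots + a_{k_{A_m}, m}\varepsilon^{k_{A_m}} + o_{A_m}(\varepsilon^{k_{A_m}})$, $\varepsilon \in (0, \varepsilon_0]$ be an
$(h_{A_m}, k_{A_m})$-expansion, for $m = 1, \ldots, N$, $B_n(\varepsilon) = A_1(\varepsilon) + \cdots + A_n(\varepsilon)$, $\varepsilon \in
(0, \varepsilon_0]$, and $C_n(\varepsilon) = A_1(\varepsilon) \times \cdots \times A_n(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$, for $n = 1, \ldots, N$.

The following two lemmas follow, respectively, from Lemmas 10.3 and 10.4 and the
recurrent relations $B_n(\varepsilon) = B_{n-1}(\varepsilon) + A_n(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$, $n = 2, \ldots, N$ and $C_n(\varepsilon) =
C_{n-1}(\varepsilon) \cdot A_n(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$, $n = 2, \ldots, N$, which hold for any $N \ge 2$.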


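Lemma 10.5 The above asymptotic expansions have the following operational rules
for computing coefficients:

(i) If $A_m(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$ is an $(h_{A_m}, k_{A_m})$-expansion for $m = 1, \ldots, N$ where $N \ge 2$,
then $B_n(\varepsilon) = b_{h_{B_n}, n}\varepsilon^{h_{B_n}} + \cdots + b_{k_{B_n}, n}\varepsilon^{k_{B_n}} + o_{B_n}(\varepsilon^{k_{B_n}})$, $\varepsilon \in (0, \varepsilon_0]$ is an $(h_{B_n}, k_{B_n})$-expansion,
for $n = 1, \ldots, N$, with $h_{B_1} = h_{A_1}$, $k_{B_1} = k_{A_1}$ and $h_{B_n} = \min(h_{A_1}, \ldots,
h_{A_n}) = h_{B_{n-1}} \wedge h_{A_n}$, $k_{B_n} = \min(k_{A_1}, \ldots, k_{A_n}) = k_{B_{n-1}} \wedge k_{A_n}$, $n = 2, \ldots, N$ and
the coefficients given by formulas $b_{h_{B_1} + l, 1} = a_{h_{B_1} + l, 1}$, $l = 0, \ldots, k_{B_1} -
h_{B_1} = k_{A_1} - h_{A_1}$ and, for $l = 0, \ldots, k_{B_n} - h_{B_n}$, $n = 2, \ldots, N$, by formulas,

$$b_{h_{B_n} + l, n} = a_{h_{B_n} + l, 1} + \cdots + a_{h_{B_n} + l, n}, \tag{19}$$

or

$$b_{h_{B_n} + l, n} = b_{h_{B_n} + l, n-1} + a_{h_{B_n} + l, n}, \tag{20}$$

where $b_{h_{B_n} + l, n-1} = 0$ for $0 \le l < h_{B_{n-1}} - h_{B_n}$ and $a_{h_{B_n} + l, m} = 0$ for $0 \le l <
h_{A_m} - h_{B_n}$, $m = 1, \ldots, n$, $n = 1, \ldots, N$.
Expansions $B_n(\varepsilon)$, $n = 1, \ldots, N$ are pivotal if and only if $b_{h_{B_n}, n} = a_{h_{B_n}, 1} +
\cdots + a_{h_{B_n}, n} \ne 0$, $n = 1, \ldots, N$.

(ii) If $A_m(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$ is an $(h_{A_m}, k_{A_m})$-expansion for $m = 1, \ldots, N$ where $N \ge 2$,
then $C_n(\varepsilon) = c_{h_{C_n}, n}\varepsilon^{h_{C_n}} + \cdots + c_{k_{C_n}, n}\varepsilon^{k_{C_n}} + o_{C_n}(\varepsilon^{k_{C_n}})$, $\varepsilon \in (0, \varepsilon_0]$ is an $(h_{C_n}, k_{C_n})$-expansion,
for $n = 1, \ldots, N$, with $h_{C_1} = h_{A_1}$, $k_{C_1} = k_{A_1}$ and $h_{C_n} = h_{A_1} + \cdots +
h_{A_n} = h_{C_{n-1}} + h_{A_n}$, $k_{C_n} = \min(k_{A_l} + \sum_{1 \le r \le n, r \ne l} h_{A_r},\ l = 1, \ldots, n) =
(h_{C_{n-1}} + k_{A_n}) \wedge (k_{C_{n-1}} + h_{A_n})$, $n = 2, \ldots, N$ and coefficients given by formulas
$c_{h_{C_1} + l, 1} = a_{h_{C_1} + l, 1}$, $l = 0, \ldots, k_{C_1} - h_{C_1} = k_{A_1} - h_{A_1}$ and, for $l = 0, \ldots,
k_{C_n} - h_{C_n}$, $n = 2, \ldots, N$, by formulas,

$$c_{h_{C_n} + l, n} = \sum_{l_1 + \cdots + l_n = l,\ 0 \le l_i \le k_{A_i} - h_{A_i},\ i = 1, \ldots, n}\ \prod_{1 \le i \le n} a_{h_{A_i} + l_i, i}, \tag{21}$$

or

$$c_{h_{C_n} + l, n} = \sum_{0 \le l' \le l} c_{h_{C_{n-1}} + l', n-1}\, a_{h_{A_n} + l - l', n}. \tag{22}$$

Expansions $C_n(\varepsilon)$, $n = 1, \ldots, N$ are pivotal if and only if $c_{h_{C_n}, n} = a_{h_{A_1}, 1} \times
\cdots \times a_{h_{A_n}, n} \ne 0$, $n = 1, \ldots, N$.

(iii) Asymptotic expansions for functions $B_n(\varepsilon) = A_1(\varepsilon) + \cdots + A_n(\varepsilon)$,
$n = 1, \ldots, N$ and $C_n(\varepsilon) = A_1(\varepsilon) \times \cdots \times A_n(\varepsilon)$, $n = 1, \ldots, N$ are invariant
with respect to any permutation, respectively, of summation and multiplication
order in the above formulas (19) and (21).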
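
The recurrent formulas (20) and (22) amount to folding the binary summation and multiplication rules over a list of expansions. The Python sketch below illustrates this with `itertools.accumulate`; the pair representation `(h, [a_h, ..., a_k])` and all function names are our assumptions, and remainders are left implicit.

```python
from itertools import accumulate

def add(A, B):
    """Coefficients of A + B: h = min of h's, k = min of k's."""
    (hA, a), (hB, b) = A, B
    hC = min(hA, hB)
    kC = min(hA + len(a) - 1, hB + len(b) - 1)
    coef = lambda h, v, l: v[l - h] if 0 <= l - h < len(v) else 0
    return hC, [coef(hA, a, l) + coef(hB, b, l) for l in range(hC, kC + 1)]

def mul(A, B):
    """Coefficients of A * B: h = h_A + h_B, length = min of the two lengths."""
    (hA, a), (hB, b) = A, B
    w = min(len(a), len(b)) - 1
    return hA + hB, [sum(a[i] * b[r - i] for i in range(r + 1)) for r in range(w + 1)]

def partial_sums(expansions):
    """B_1, ..., B_N computed by the recurrence B_n = B_{n-1} + A_n."""
    return list(accumulate(expansions, add))

def partial_products(expansions):
    """C_1, ..., C_N computed by the recurrence C_n = C_{n-1} * A_n."""
    return list(accumulate(expansions, mul))

# Hypothetical example: A_1 = 1 + e + o(e), A_2 = e + e^2 + e^3 + o(e^3),
# A_3 = 2/e + 0 + 0*e + o(e).
As = [(0, [1, 1]), (1, [1, 1, 1]), (-1, [2, 0, 0])]
```

In this example the last partial sum and the last partial product do not change when the list is reversed, in line with proposition (iii).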


Lemma 10.6 The above asymptotic expansions have the following operational rules 
for computing remainders: 


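(i) If $A_m(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$ is an $(h_{A_m}, k_{A_m}, \delta_{A_m}, G_{A_m}, \varepsilon_{A_m})$-expansion for $m = 1,
\ldots, N$ where $N \ge 2$, then $B_n(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$ is an $(h_{B_n}, k_{B_n}, \delta_{B_n}, G_{B_n}, \varepsilon_{B_n})$-expansion,
for $n = 1, \ldots, N$, with parameters $h_{B_1} = h_{A_1}$, $k_{B_1} = k_{A_1}$ and $h_{B_n} =
\min(h_{A_1}, \ldots, h_{A_n}) = h_{B_{n-1}} \wedge h_{A_n}$, $k_{B_n} = \min(k_{A_1}, \ldots, k_{A_n}) = k_{B_{n-1}} \wedge k_{A_n}$,
$n = 2, \ldots, N$, coefficients $b_{h_{B_n} + l, n}$, $l = 0, \ldots, k_{B_n} - h_{B_n}$, $n = 1, \ldots, N$ given in
proposition (i) of Lemma 10.5 and parameters $G_{B_n}, \delta_{B_n}, \varepsilon_{B_n}$, $n = 1, \ldots, N$ given
by formulas $\delta_{B_1} = \delta_{A_1} \ge \delta^*_N = \min_{1 \le m \le N} \delta_{A_m}$, $G_{B_1} = G_{A_1}$, $\varepsilon_{B_1} = \varepsilon_{A_1}$ and, for
$n = 2, \ldots, N$, by formulas,

$$\delta_{B_n} = \min_{m \in K_n} \delta_{A_m} \ge \delta^*_N, \quad \text{where } K_n = \{m : 1 \le m \le n,\ k_{A_m} = \min(k_{A_1}, \ldots, k_{A_n})\},$$

$$G_{B_n} = \sum_{1 \le i \le n} \Big( G_{A_i}\, \varepsilon_{B_n}^{k_{A_i} + \delta_{A_i} - k_{B_n} - \delta_{B_n}} + \sum_{k_{B_n} < j \le k_{A_i}} |a_{j, i}|\, \varepsilon_{B_n}^{j - k_{B_n} - \delta_{B_n}} \Big),$$

$$\varepsilon_{B_n} = \min(\varepsilon_{A_1}, \ldots, \varepsilon_{A_n}), \tag{23}$$

or by alternative recurrent formulas,

$$\delta_{B_n} = \min_{m \in K_n} \delta_{A_m} = \begin{cases} \delta_{B_{n-1}} & \text{if } k_{B_n} = k_{B_{n-1}} < k_{A_n}, \\ \delta_{B_{n-1}} \wedge \delta_{A_n} & \text{if } k_{B_n} = k_{B_{n-1}} = k_{A_n}, \\ \delta_{A_n} & \text{if } k_{B_n} = k_{A_n} < k_{B_{n-1}}, \end{cases} \ \ \ge \delta^*_N,$$

$$\begin{aligned} G_{B_n} ={}& G_{B_{n-1}}\, \varepsilon_{B_n}^{k_{B_{n-1}} + \delta_{B_{n-1}} - k_{B_n} - \delta_{B_n}} + \sum_{k_{B_n} < i \le k_{B_{n-1}}} |b_{i, n-1}|\, \varepsilon_{B_n}^{i - k_{B_n} - \delta_{B_n}} \\ &+ G_{A_n}\, \varepsilon_{B_n}^{k_{A_n} + \delta_{A_n} - k_{B_n} - \delta_{B_n}} + \sum_{k_{B_n} < i \le k_{A_n}} |a_{i, n}|\, \varepsilon_{B_n}^{i - k_{B_n} - \delta_{B_n}}, \end{aligned}$$

$$\varepsilon_{B_n} = \varepsilon_{B_{n-1}} \wedge \varepsilon_{A_n}. \tag{24}$$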
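(ii) If $A_m(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$ is an $(h_{A_m}, k_{A_m}, \delta_{A_m}, G_{A_m}, \varepsilon_{A_m})$-expansion for $m = 1,
\ldots, N$ where $N \ge 2$, then $C_n(\varepsilon)$, $\varepsilon \in (0, \varepsilon_0]$ is an $(h_{C_n}, k_{C_n}, \delta_{C_n}, G_{C_n}, \varepsilon_{C_n})$-expansion,
for $n = 1, \ldots, N$, with parameters $h_{C_1} = h_{A_1}$, $k_{C_1} = k_{A_1}$ and
$h_{C_n} = h_{C_{n-1}} + h_{A_n} = h_{A_1} + \cdots + h_{A_n}$, $k_{C_n} = (h_{C_{n-1}} + k_{A_n}) \wedge (k_{C_{n-1}} + h_{A_n}) =
\min_{1 \le l \le n} (k_{A_l} + \sum_{1 \le r \le n, r \ne l} h_{A_r})$, $n = 2, \ldots, N$, coefficients $c_{h_{C_n} + l, n}$, $l = 0,
\ldots, k_{C_n} - h_{C_n}$, $n = 1, \ldots, N$ given in proposition (ii) of Lemma 10.5 and
parameters $\delta_{C_n}, G_{C_n}, \varepsilon_{C_n}$, $n = 1, \ldots, N$ given by formulas $\delta_{C_1} = \delta_{A_1} \ge \delta^*_N =
\min_{1 \le m \le N} \delta_{A_m}$, $G_{C_1} = G_{A_1}$, $\varepsilon_{C_1} = \varepsilon_{A_1}$ and, for $n = 2, \ldots, N$, by formulas,

$$\delta_{C_n} = \min_{m \in L_n} \delta_{A_m} \ge \delta^*_N,$$

where

$$L_n = \Big\{ m : 1 \le m \le n,\ \Big( k_{A_m} + \sum_{1 \le r \le n,\ r \ne m} h_{A_r} \Big) = \min_{1 \le l \le n} \Big( k_{A_l} + \sum_{1 \le r \le n,\ r \ne l} h_{A_r} \Big) \Big\},$$

$$\begin{aligned} G_{C_n} ={}& \sum_{k_{C_n} < l_1 + \cdots + l_n,\ h_{A_i} \le l_i \le k_{A_i},\ i = 1, \ldots, n}\ \prod_{1 \le i \le n} |a_{l_i, i}|\, \varepsilon_{C_n}^{l_1 + \cdots + l_n - k_{C_n} - \delta_{C_n}} \\ &+ \sum_{1 \le j \le n}\ \prod_{1 \le i \le n,\ i \ne j} \Big( \sum_{h_{A_i} \le l \le k_{A_i}} |a_{l, i}|\, \varepsilon_{C_n}^{l} + G_{A_i}\, \varepsilon_{C_n}^{k_{A_i} + \delta_{A_i}} \Big)\, G_{A_j}\, \varepsilon_{C_n}^{k_{A_j} + \delta_{A_j} - k_{C_n} - \delta_{C_n}}, \end{aligned}$$

$$\varepsilon_{C_n} = \min_{1 \le i \le n} \varepsilon_{A_i}, \tag{25}$$

or by alternative recurrent formulas,

$$\delta_{C_n} = \begin{cases} \delta_{C_{n-1}} & \text{if } k_{C_n} = h_{A_n} + k_{C_{n-1}} < h_{C_{n-1}} + k_{A_n}, \\ \delta_{A_n} \wedge \delta_{C_{n-1}} & \text{if } k_{C_n} = h_{C_{n-1}} + k_{A_n} = h_{A_n} + k_{C_{n-1}}, \\ \delta_{A_n} & \text{if } k_{C_n} = h_{C_{n-1}} + k_{A_n} < h_{A_n} + k_{C_{n-1}}, \end{cases} \ \ \ge \delta^*_N,$$

$$\begin{aligned} G_{C_n} ={}& \sum_{k_{C_n} < i + j,\ h_{C_{n-1}} \le i \le k_{C_{n-1}},\ h_{A_n} \le j \le k_{A_n}} |c_{i, n-1}||a_{j, n}|\, \varepsilon_{C_n}^{i + j - k_{C_n} - \delta_{C_n}} \\ &+ G_{C_{n-1}} \sum_{h_{A_n} \le i \le k_{A_n}} |a_{i, n}|\, \varepsilon_{C_n}^{i + k_{C_{n-1}} + \delta_{C_{n-1}} - k_{C_n} - \delta_{C_n}} \\ &+ G_{A_n} \sum_{h_{C_{n-1}} \le i \le k_{C_{n-1}}} |c_{i, n-1}|\, \varepsilon_{C_n}^{i + k_{A_n} + \delta_{A_n} - k_{C_n} - \delta_{C_n}} \\ &+ G_{A_n} G_{C_{n-1}}\, \varepsilon_{C_n}^{k_{A_n} + k_{C_{n-1}} + \delta_{A_n} + \delta_{C_{n-1}} - k_{C_n} - \delta_{C_n}}, \end{aligned}$$

$$\varepsilon_{C_n} = \varepsilon_{C_{n-1}} \wedge \varepsilon_{A_n}. \tag{26}$$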
(iii) Parameters $\delta_{C_n}, G_{C_n}, \varepsilon_{C_n}$, $n = 1, \ldots, N$ in upper bounds for remainders in the
asymptotic expansions for functions $B_n(\varepsilon) = A_1(\varepsilon) + \cdots + A_n(\varepsilon)$, $n =
1, \ldots, N$ and $C_n(\varepsilon) = A_1(\varepsilon) \times \cdots \times A_n(\varepsilon)$, $n = 1, \ldots, N$ are invariant with
respect to any permutation, respectively, of summation and multiplication order
in the above formulas (23) and (25).


It should be noted that formulas (23) and (25) give, in general, values which
are less than or equal to the values for these constants given by the alternative formulas,
respectively, (24) and (26).
2.3 Proofs of Lemmas 10.1-10.6


The formulas given in Lemmas 10.1 and 10.2 are quite obvious. The same relates to
the formulas in propositions (i)-(ii) (the multiplication by a constant and summation
rules) of Lemmas 10.3 and 10.4. They can be obtained by simple accumulation of
coefficients for different powers of $\varepsilon$ and terms accumulated in the corresponding
remainders, as well as obvious upper bounds for absolute values of sums of terms
accumulated in the corresponding remainders. Lemmas 10.5 and 10.6 are corollaries
of Lemmas 10.3 and 10.4.

Let us, therefore, give short proofs of propositions (iii)-(v) of Lemmas 10.3
and 10.4.

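Multiplication of the asymptotic expansions $A(\varepsilon)$ and $B(\varepsilon)$ appearing in proposition
(iii) of Lemma 10.3 and accumulation of coefficients for powers $\varepsilon^l$ for $l = h_C, \ldots, k_C$
yields the following relation,

$$\begin{aligned} C(\varepsilon) &= A(\varepsilon)B(\varepsilon) \\ &= \big(a_{h_A}\varepsilon^{h_A} + \cdots + a_{k_A}\varepsilon^{k_A} + o_A(\varepsilon^{k_A})\big)\big(b_{h_B}\varepsilon^{h_B} + \cdots + b_{k_B}\varepsilon^{k_B} + o_B(\varepsilon^{k_B})\big) \\ &= \sum_{h_C \le l \le k_C}\ \sum_{i + j = l,\ h_A \le i \le k_A,\ h_B \le j \le k_B} a_i b_j \varepsilon^{l} + \sum_{k_C < i + j,\ h_A \le i \le k_A,\ h_B \le j \le k_B} a_i b_j \varepsilon^{i + j} \\ &\quad + \sum_{h_B \le j \le k_B} b_j \varepsilon^{j} o_A(\varepsilon^{k_A}) + \sum_{h_A \le i \le k_A} a_i \varepsilon^{i} o_B(\varepsilon^{k_B}) + o_A(\varepsilon^{k_A})\, o_B(\varepsilon^{k_B}) \\ &= \sum_{h_C \le l \le k_C} c_l \varepsilon^{l} + o_C(\varepsilon^{k_C}), \end{aligned} \tag{27}$$

where

$$\begin{aligned} o_C(\varepsilon^{k_C}) ={}& \sum_{k_C < i + j,\ h_A \le i \le k_A,\ h_B \le j \le k_B} a_i b_j \varepsilon^{i + j} + \sum_{h_B \le j \le k_B} b_j \varepsilon^{j} o_A(\varepsilon^{k_A}) \\ &+ \sum_{h_A \le i \le k_A} a_i \varepsilon^{i} o_B(\varepsilon^{k_B}) + o_A(\varepsilon^{k_A})\, o_B(\varepsilon^{k_B}). \end{aligned} \tag{28}$$

Obviously,

$$\frac{o_C(\varepsilon^{k_C})}{\varepsilon^{k_C}} \to 0 \ \text{ as } 0 < \varepsilon \to 0. \tag{29}$$

It should be noted that the accumulation of coefficients for powers $\varepsilon^l$ can be made
in (27) only up to the maximal value $l = k_C = (h_A + k_B) \wedge (h_B + k_A)$, because of
the presence, in the expression for the remainder $o_C(\varepsilon^{k_C})$, of the terms $b_{h_B}\varepsilon^{h_B} o_A(\varepsilon^{k_A})$ and
$a_{h_A}\varepsilon^{h_A} o_B(\varepsilon^{k_B})$.

Also, relation (28) readily implies relation (15), which determines parameters
$\delta_C, G_C, \varepsilon_C$ in proposition (iii) of Lemma 10.4.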
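The assumptions of proposition (iv) in Lemma 10.3 imply that the following
relation holds,

$$\varepsilon^{-h_B} B(\varepsilon) \to b_{h_B} \ne 0 \ \text{ as } 0 < \varepsilon \to 0. \tag{30}$$

This relation implies that there exists $0 < \varepsilon'_0 \le \varepsilon_0$ such that $B(\varepsilon) \ne 0$ for $\varepsilon \in
(0, \varepsilon'_0]$, and, thus, function $C(\varepsilon) = \frac{1}{B(\varepsilon)}$ is well defined for $\varepsilon \in (0, \varepsilon'_0]$.
The assumptions of proposition (iv) of Lemma 10.3 also imply that,

$$\varepsilon^{h_B} C(\varepsilon) = \frac{1}{b_{h_B} + b_{h_B + 1}\varepsilon + \cdots + b_{k_B}\varepsilon^{k_B - h_B} + o_B(\varepsilon^{k_B})\varepsilon^{-h_B}} \to \frac{1}{b_{h_B}} = c_{h_C} \ \text{ as } 0 < \varepsilon \to 0. \tag{31}$$

This relation means that function $\varepsilon^{h_B} C(\varepsilon)$ can be represented in the form $\varepsilon^{h_B} C(\varepsilon)
= c_{h_C} + o(1)$, where $c_{h_C} = b_{h_B}^{-1}$, or, equivalently, that the following representation
holds,

$$C(\varepsilon) = c_{h_C}\varepsilon^{h_C} + C_1(\varepsilon), \quad \varepsilon \in (0, \varepsilon'_0], \tag{32}$$

where

$$\frac{C_1(\varepsilon)}{\varepsilon^{h_C}} \to 0 \ \text{ as } 0 < \varepsilon \to 0. \tag{33}$$

Relations (32) and (33) prove proposition (iv) of Lemma 10.3 for the case where
$h_B = k_B$, which is equivalent to the relation $h_C = -h_B = k_C = k_B - 2h_B$.

Note that, in the case $h_B = k_B$, the asymptotic expansion (32) for function $C(\varepsilon)$
cannot be extended. Indeed,

$$\varepsilon^{-h_C - 1} C_1(\varepsilon) = \varepsilon^{h_B - 1}\big(C(\varepsilon) - c_{h_C}\varepsilon^{h_C}\big) = -\frac{c_{h_C}\, o_B(\varepsilon^{h_B})\, \varepsilon^{-h_B - 1}}{b_{h_B} + o_B(\varepsilon^{h_B})\varepsilon^{-h_B}}. \tag{34}$$

The term $o_B(\varepsilon^{h_B})/\varepsilon^{h_B + 1}$ on the right hand side in (34) has an uncertain asymptotic
behaviour as $0 < \varepsilon \to 0$.

Let us now assume that $h_B + 1 = k_B$, which is equivalent to the relation $h_C = -h_B =
k_C - 1 = k_B - 2h_B - 1$.

In this case, the assumptions of proposition (iv) of Lemma 10.3 and relations (32),
(33) and (34) imply that

$$\begin{aligned} \varepsilon^{-h_C - 1} C_1(\varepsilon) &= \varepsilon^{h_B - 1}\big(C(\varepsilon) - c_{h_C}\varepsilon^{h_C}\big) \\ &= \frac{-b_{h_B + 1} c_{h_C} - c_{h_C}\, o_B(\varepsilon^{h_B + 1})\, \varepsilon^{-h_B - 1}}{b_{h_B} + b_{h_B + 1}\varepsilon + o_B(\varepsilon^{h_B + 1})\varepsilon^{-h_B}} \to -\frac{b_{h_B + 1} c_{h_C}}{b_{h_B}} = c_{h_C + 1} \ \text{ as } 0 < \varepsilon \to 0. \end{aligned} \tag{35}$$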
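This relation means that function $\varepsilon^{h_B - 1} C_1(\varepsilon)$ can be represented in the form
$\varepsilon^{h_B - 1} C_1(\varepsilon) = c_{h_C + 1} + o(1)$, where $c_{h_C + 1} = -b_{h_B}^{-1} b_{h_B + 1} c_{h_C}$, or, equivalently, that the
following representation holds,

$$C(\varepsilon) = c_{h_C}\varepsilon^{h_C} + c_{h_C + 1}\varepsilon^{h_C + 1} + C_2(\varepsilon), \quad \varepsilon \in (0, \varepsilon'_0], \tag{36}$$

where

$$\frac{C_2(\varepsilon)}{\varepsilon^{h_C + 1}} \to 0 \ \text{ as } 0 < \varepsilon \to 0. \tag{37}$$

Relations (36) and (37) yield proposition (iv) of Lemma 10.3 for the case where
$h_B + 1 = k_B$.

Note that, in the case $h_B + 1 = k_B$, the asymptotic expansion (36) for function
$C(\varepsilon)$ cannot be extended. Indeed,

$$\begin{aligned} \varepsilon^{-h_C - 2} C_2(\varepsilon) &= \varepsilon^{h_B - 2}\big(C(\varepsilon) - c_{h_C}\varepsilon^{h_C} - c_{h_C + 1}\varepsilon^{h_C + 1}\big) \\ &= \frac{-b_{h_B + 1} c_{h_C + 1} - (c_{h_C} + c_{h_C + 1}\varepsilon)\, o_B(\varepsilon^{h_B + 1})\, \varepsilon^{-h_B - 2}}{b_{h_B} + b_{h_B + 1}\varepsilon + o_B(\varepsilon^{h_B + 1})\varepsilon^{-h_B}}. \end{aligned} \tag{38}$$

The term $o_B(\varepsilon^{h_B + 1})/\varepsilon^{h_B + 2}$ on the right hand side in (38) has an uncertain asymptotic
behaviour as $0 < \varepsilon \to 0$.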

Repeating the above arguments, we can prove that function $C(\varepsilon)$ can be
represented in the form of an $(h_C, k_C)$-expansion, with parameters $h_C, k_C$ and
coefficients $c_{h_C}, \ldots, c_{k_C}$ given in proposition (iv) of Lemma 10.3, for the general case
where $h_B + n = k_B$, or, equivalently, $h_C = -h_B = k_C - n = k_B - 2h_B - n$, for any
$n = 0, 1, \ldots$.
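
As a small worked illustration of the reciprocal rule (the numbers are our own example, not from the source), take a pivotal expansion with $h_B = 1$, $k_B = 2$:

$$B(\varepsilon) = 2\varepsilon + \varepsilon^2 + o_B(\varepsilon^2), \qquad b_1 = 2,\ b_2 = 1.$$

Then $h_C = -h_B = -1$ and $k_C = k_B - 2h_B = 0$, and formula (10) gives

$$c_{-1} = b_1^{-1} = \tfrac12, \qquad c_0 = -b_1^{-1} b_2 c_{-1} = -\tfrac14, \qquad \frac{1}{B(\varepsilon)} = \tfrac12 \varepsilon^{-1} - \tfrac14 + o_C(\varepsilon^{0}).$$

In the special case $o_B \equiv 0$ this agrees with the geometric series $\frac{1}{2\varepsilon(1 + \varepsilon/2)} = \frac{1}{2\varepsilon} - \frac14 + \frac{\varepsilon}{8} - \cdots$. The coefficient of $\varepsilon^1$ cannot be claimed in general, since it depends on the unknown remainder $o_B(\varepsilon^2)$.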


The $(h_C, k_C)$-expansion for function $C(\varepsilon) = \frac{1}{B(\varepsilon)}$ can be rewritten in the equivalent
form of the following relation,

$$1 = \big(b_{h_B}\varepsilon^{h_B} + \cdots + b_{k_B}\varepsilon^{k_B} + o_B(\varepsilon^{k_B})\big)\big(c_{h_C}\varepsilon^{h_C} + \cdots + c_{k_C}\varepsilon^{k_C} + o_C(\varepsilon^{k_C})\big). \tag{39}$$

Proposition (iii) of Lemma 10.3, applied to the product on the right hand side
in (39), permits us to represent this product in the form of an $(h, k)$-expansion with
parameters $h = h_B + h_C = h_B - h_B = 0$ and $k = (h_B + k_C) \wedge (k_B + h_C) = (k_B - h_B) \wedge
(k_B - 2h_B + h_B) = k_B - h_B$.

Canceling the coefficients for $\varepsilon^l$ on the left and right hand sides in (39), for $l =
0, \ldots, k_B - h_B$, and then solving equation (39) with respect to the remainder $o_C(\varepsilon^{k_C})$,
we find the following formula for this remainder,
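$$\begin{aligned} o_C(\varepsilon^{k_C}) &= -\frac{\sum_{k_B - h_B < i + j,\ h_B \le i \le k_B,\ h_C \le j \le k_C} b_i c_j \varepsilon^{i + j} + \sum_{h_C \le j \le k_C} c_j \varepsilon^{j} o_B(\varepsilon^{k_B})}{b_{h_B}\varepsilon^{h_B} + \cdots + b_{k_B}\varepsilon^{k_B} + o_B(\varepsilon^{k_B})} \\ &= -\frac{\sum_{k_B - h_B < i + j,\ h_B \le i \le k_B,\ h_C \le j \le k_C} b_i c_j \varepsilon^{i + j - h_B}}{b_{h_B} + \cdots + b_{k_B}\varepsilon^{k_B - h_B} + o_B(\varepsilon^{k_B})\varepsilon^{-h_B}} \\ &\quad - \frac{\sum_{h_C \le j \le k_C} c_j \varepsilon^{j - h_B} o_B(\varepsilon^{k_B})}{b_{h_B} + \cdots + b_{k_B}\varepsilon^{k_B - h_B} + o_B(\varepsilon^{k_B})\varepsilon^{-h_B}}. \end{aligned} \tag{40}$$

The assumptions made in proposition (iv) of Lemma 10.4 imply that $B(\varepsilon) \ne 0$
and the following inequality holds for $0 < \varepsilon \le \varepsilon_C$, where $\varepsilon_C$ is given in relation (16),

$$\big| b_{h_B} + b_{h_B + 1}\varepsilon + \cdots + b_{k_B}\varepsilon^{k_B - h_B} + o_B(\varepsilon^{k_B})\varepsilon^{-h_B} \big| \ge \frac{|b_{h_B}|}{2} > 0. \tag{41}$$

The assumptions made in proposition (iv) of Lemma 10.4 and inequality (41)
finally imply that the following inequality holds, for $0 < \varepsilon \le \varepsilon_C$,

$$\begin{aligned} |o_C(\varepsilon^{k_C})| \le \varepsilon^{k_B - 2h_B + \delta_B} \Big(\frac{|b_{h_B}|}{2}\Big)^{-1} \Big( & \sum_{k_B - h_B < i + j,\ h_B \le i \le k_B,\ h_C \le j \le k_C} |b_i||c_j|\, \varepsilon_C^{i + j - k_B + h_B - \delta_B} \\ &+ G_B \sum_{h_C \le j \le k_C} |c_j|\, \varepsilon_C^{j + h_B} \Big). \end{aligned} \tag{42}$$

This inequality proves proposition (iv) of Lemma 10.4.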

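The first statement of proposition (v) in Lemma 10.3 states that function $D(\varepsilon)$
can be represented as an $(h_D, k_D)$-expansion with parameters $h_D, k_D$ and coefficients
$d_{h_D}, \ldots, d_{k_D}$ given in this proposition and relation (11). It is a direct corollary
of propositions (iii) and (iv) of Lemma 10.3, which should just be applied to the
product $D(\varepsilon) = A(\varepsilon) \cdot \frac{1}{B(\varepsilon)}$, $\varepsilon \in (0, \varepsilon'_0]$.

Note that, in this case, parameters $h_D = h_A + h_C = h_A - h_B$ and $k_D = (k_A +
h_C) \wedge (h_A + k_C) = (k_A - h_B) \wedge (h_A + k_B - 2h_B)$.

Now, when it is already proved that $D(\varepsilon)$ is an $(h_D, k_D)$-expansion, its coefficients
can also be computed by equating coefficients for powers $\varepsilon^l$ for $l = h_D, \ldots, k_D$
on the left and right hand sides of the relation,

$$\begin{aligned} A(\varepsilon) &= B(\varepsilon) D(\varepsilon) \\ &= \big(b_{h_B}\varepsilon^{h_B} + \cdots + b_{k_B}\varepsilon^{k_B} + o_B(\varepsilon^{k_B})\big) \\ &\quad \times \big(d_{h_D}\varepsilon^{h_D} + \cdots + d_{k_D}\varepsilon^{k_D} + o_D(\varepsilon^{k_D})\big). \end{aligned} \tag{43}$$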
This procedure yields the second statement of proposition (v) in Lemma 10.3 and
the corresponding formulas given in relation (12).

The first statement of proposition (v) in Lemma 10.4 and relations
(17) can be obtained by direct application of propositions (iii) and (iv) and relations
(15) and (16) given in Lemma 10.4, to the product $D(\varepsilon) = A(\varepsilon) \cdot \frac{1}{B(\varepsilon)}$.

Proposition (iii) of Lemma 10.3, applied to the product on the right hand side in
(43), permits us to represent this product in the form of an $(h, k)$-expansion with parameters
$h = h_B + h_D = h_B + h_A - h_B = h_A$ and $k = (h_B + k_D) \wedge (k_B + h_D) = (h_B + (k_A -
h_B) \wedge (h_A + k_B - 2h_B)) \wedge (k_B + h_A - h_B) = k_A \wedge (k_B + h_A - h_B)$.

Canceling the coefficients for $\varepsilon^l$ on the left and right hand sides in (43), for $l =
h_A, \ldots, k_A \wedge (k_B + h_A - h_B)$, and then solving Eq. (43) with respect to the remainder
$o_D(\varepsilon^{k_D})$ yields the following formula for this remainder,


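$$\begin{aligned} o_D(\varepsilon^{k_D}) &= \frac{\sum_{k_A \wedge (k_B + h_A - h_B) < l \le k_A} a_l \varepsilon^{l} + o_A(\varepsilon^{k_A})}{b_{h_B}\varepsilon^{h_B} + \cdots + b_{k_B}\varepsilon^{k_B} + o_B(\varepsilon^{k_B})} - \frac{\sum_{k_A \wedge (k_B + h_A - h_B) < i + j,\ h_B \le i \le k_B,\ h_D \le j \le k_D} b_i d_j \varepsilon^{i + j}}{b_{h_B}\varepsilon^{h_B} + \cdots + b_{k_B}\varepsilon^{k_B} + o_B(\varepsilon^{k_B})} \\ &\quad - \frac{\sum_{h_D \le j \le k_D} d_j \varepsilon^{j} o_B(\varepsilon^{k_B})}{b_{h_B}\varepsilon^{h_B} + \cdots + b_{k_B}\varepsilon^{k_B} + o_B(\varepsilon^{k_B})} \\ &= \frac{\sum_{k_A \wedge (k_B + h_A - h_B) < l \le k_A} a_l \varepsilon^{l - h_B} + o_A(\varepsilon^{k_A})\varepsilon^{-h_B}}{b_{h_B} + \cdots + b_{k_B}\varepsilon^{k_B - h_B} + o_B(\varepsilon^{k_B})\varepsilon^{-h_B}} - \frac{\sum_{k_A \wedge (k_B + h_A - h_B) < i + j,\ h_B \le i \le k_B,\ h_D \le j \le k_D} b_i d_j \varepsilon^{i + j - h_B}}{b_{h_B} + \cdots + b_{k_B}\varepsilon^{k_B - h_B} + o_B(\varepsilon^{k_B})\varepsilon^{-h_B}} \\ &\quad - \frac{\sum_{h_D \le j \le k_D} d_j \varepsilon^{j - h_B} o_B(\varepsilon^{k_B})}{b_{h_B} + \cdots + b_{k_B}\varepsilon^{k_B - h_B} + o_B(\varepsilon^{k_B})\varepsilon^{-h_B}}. \end{aligned} \tag{44}$$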


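The assumptions made in proposition (v) of Lemma 10.4 and inequality (41)
finally imply that the following inequality holds, for $0 < \varepsilon \le \varepsilon_D$ given in relation
(18),

$$\begin{aligned} |o_D(\varepsilon^{k_D})| \le \varepsilon^{k_D + \delta_D} \Big(\frac{|b_{h_B}|}{2}\Big)^{-1} \Big( & \sum_{k_A \wedge (k_B + h_A - h_B) < l \le k_A} |a_l|\, \varepsilon_D^{l - k_D - h_B - \delta_D} \\ &+ \sum_{k_A \wedge (k_B + h_A - h_B) < i + j,\ h_B \le i \le k_B,\ h_D \le j \le k_D} |b_i||d_j|\, \varepsilon_D^{i + j - k_D - h_B - \delta_D} \\ &+ G_A\, \varepsilon_D^{k_A + \delta_A - h_B - k_D - \delta_D} + G_B \sum_{h_D \le j \le k_D} |d_j|\, \varepsilon_D^{j + k_B + \delta_B - h_B - k_D - \delta_D} \Big). \end{aligned} \tag{45}$$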


2.4 Algebraic and Related Properties of Operational Rules 
for Laurent Asymptotic Expansions 


Let us also introduce the parameter $w_A = k_A - h_A$, which is the length of a Laurent
asymptotic expansion $A(\varepsilon) = a_{h_A}\varepsilon^{h_A} + \cdots + a_{k_A}\varepsilon^{k_A} + o_A(\varepsilon^{k_A})$.
The following useful lemma holds.


Lemma 10.7 The following relations hold for the Laurent asymptotic expansions
appearing in Lemma 10.3:

(i) If $C(\varepsilon) = cA(\varepsilon)$, then $w_C = w_A$.
(ii) If $C(\varepsilon) = A(\varepsilon) + B(\varepsilon)$, then $w_A \wedge w_B \le w_C \le w_A \vee w_B$.
(iii) If $C(\varepsilon) = A(\varepsilon) \cdot B(\varepsilon)$, then $w_C = w_A \wedge w_B$.
(iv) If $C(\varepsilon) = 1/B(\varepsilon)$, then $w_C = w_B$.
(v) If $D(\varepsilon) = A(\varepsilon)/B(\varepsilon)$, then $w_D = w_A \wedge w_B$.
(vi) If $w_A = w_B = w$ then $w_C = w_D = w$ for all Laurent asymptotic expansions
appearing in Lemma 10.3.

The proof of this simple lemma readily follows from the formulas for parameters $h$
and $k$ appearing in propositions (i)-(v) of Lemma 10.3.
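
For instance, the product and reciprocal cases can be checked in one line each from the parameters given in Lemma 10.3 (a short verification added for the reader's convenience):

$$w_C = k_C - h_C = (h_A + k_B) \wedge (h_B + k_A) - (h_A + h_B) = (k_B - h_B) \wedge (k_A - h_A) = w_B \wedge w_A \quad \text{for } C(\varepsilon) = A(\varepsilon) \cdot B(\varepsilon),$$

$$w_C = k_C - h_C = (k_B - 2h_B) - (-h_B) = k_B - h_B = w_B \quad \text{for } C(\varepsilon) = 1/B(\varepsilon).$$

The quotient case (v) then follows by combining these two identities.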

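Let us again consider four Laurent asymptotic expansions, $A(\varepsilon) = a_{h_A}\varepsilon^{h_A} + \cdots +
a_{k_A}\varepsilon^{k_A} + o_A(\varepsilon^{k_A})$, $B(\varepsilon) = b_{h_B}\varepsilon^{h_B} + \cdots + b_{k_B}\varepsilon^{k_B} + o_B(\varepsilon^{k_B})$, $C(\varepsilon) = c_{h_C}\varepsilon^{h_C} + \cdots +
c_{k_C}\varepsilon^{k_C} + o_C(\varepsilon^{k_C})$, and $D(\varepsilon) = d_{h_D}\varepsilon^{h_D} + \cdots + d_{k_D}\varepsilon^{k_D} + o_D(\varepsilon^{k_D})$, defined for $0 < \varepsilon \le
\varepsilon_0$, for some $0 < \varepsilon_0 \le 1$.

Below, sums $\sum_{l = h}^{k} d_l$ are counted as 0 if $k < h$.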

The following lemma is also a corollary of Lemma 10.3. 


Lemma 10.8 The summation and multiplication operations for the Laurent
asymptotic expansions appearing in propositions (ii) and (iii) in Lemma 10.3 possess the
following algebraic properties, which should be understood as identities for the
corresponding asymptotic expansions:

(i) The summation operation is commutative, i.e., $C(\varepsilon) = A(\varepsilon) + B(\varepsilon) = B(\varepsilon) +
A(\varepsilon)$, where $h_C = h_{A+B} = h_{B+A} = h_A \wedge h_B$, $k_C = k_{A+B} = k_{B+A} = k_A \wedge k_B$, and,

$$C(\varepsilon) = \sum_{l = 0}^{k_C - h_C} (a_{h_C + l} + b_{h_C + l})\, \varepsilon^{h_C + l} + o_C(\varepsilon^{k_C}), \tag{46}$$

where $a_{h_C + l} = 0$ for $0 \le l < h_A - h_C$, $b_{h_C + l} = 0$ for $0 \le l < h_B - h_C$.


172 D. Silvestrov and S. Silvestrov 


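(ii) The summation operation is associative, i.e., $D(\varepsilon) = (A(\varepsilon) + B(\varepsilon)) + C(\varepsilon) = A(\varepsilon) + (B(\varepsilon) + C(\varepsilon)) = A(\varepsilon) + B(\varepsilon) + C(\varepsilon)$, where $h_D = h_{(A+B)+C} = h_{A+(B+C)} = h_{A+B+C} = h_A \wedge h_B \wedge h_C$, $k_D = k_{(A+B)+C} = k_{A+(B+C)} = k_{A+B+C} = k_A \wedge k_B \wedge k_C$, and,

$$D(\varepsilon) = \sum_{l=0}^{k_D - h_D} (a_{h_D+l} + b_{h_D+l} + c_{h_D+l})\,\varepsilon^{h_D+l} + o_D(\varepsilon^{k_D}), \tag{47}$$

where $a_{h_D+l} = 0$ for $0 \le l < h_A - h_D$, $b_{h_D+l} = 0$ for $0 \le l < h_B - h_D$, $c_{h_D+l} = 0$ for $0 \le l < h_C - h_D$.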

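(iii) The multiplication operation is commutative, i.e., $C(\varepsilon) = A(\varepsilon) \cdot B(\varepsilon) = B(\varepsilon) \cdot A(\varepsilon)$, where $h_C = h_{A \cdot B} = h_{B \cdot A} = h_A + h_B$, $k_C = k_{A \cdot B} = k_{B \cdot A} = (h_A + k_B) \wedge (k_A + h_B)$, and,

$$C(\varepsilon) = \sum_{l=0}^{k_C - h_C} \Big( \sum_{l_1 + l_2 = l,\ l_1, l_2 \ge 0} a_{h_A + l_1} b_{h_B + l_2} \Big) \varepsilon^{h_C + l} + o_C(\varepsilon^{k_C}). \tag{48}$$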


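(iv) The multiplication operation is associative, i.e., $D(\varepsilon) = (A(\varepsilon) \cdot B(\varepsilon)) \cdot C(\varepsilon) = A(\varepsilon) \cdot (B(\varepsilon) \cdot C(\varepsilon)) = A(\varepsilon) \cdot B(\varepsilon) \cdot C(\varepsilon)$, where $h_D = h_{(A \cdot B) \cdot C} = h_{A \cdot (B \cdot C)} = h_{A \cdot B \cdot C} = h_A + h_B + h_C$, $k_D = k_{(A \cdot B) \cdot C} = k_{A \cdot (B \cdot C)} = k_{A \cdot B \cdot C} = (h_A + h_B + k_C) \wedge (h_A + k_B + h_C) \wedge (k_A + h_B + h_C)$, and,

$$D(\varepsilon) = \sum_{l=0}^{k_D - h_D} \Big( \sum_{l_1 + l_2 + l_3 = l,\ l_1, l_2, l_3 \ge 0} a_{h_A + l_1} b_{h_B + l_2} c_{h_C + l_3} \Big) \varepsilon^{h_D + l} + o_D(\varepsilon^{k_D}). \tag{49}$$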


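(v) The summation and multiplication operations possess the distributive property, i.e., $D(\varepsilon) = (A(\varepsilon) + B(\varepsilon)) \cdot C(\varepsilon) = A(\varepsilon) \cdot C(\varepsilon) + B(\varepsilon) \cdot C(\varepsilon)$, where $h_D = h_{(A+B) \cdot C} = h_{A \cdot C + B \cdot C} = h_A \wedge h_B + h_C = (h_A + h_C) \wedge (h_B + h_C)$, $k_D = k_{(A+B) \cdot C} = k_{A \cdot C + B \cdot C} = (h_A \wedge h_B + k_C) \wedge (k_A \wedge k_B + h_C) = (h_A + k_C) \wedge (k_A + h_C) \wedge (h_B + k_C) \wedge (k_B + h_C)$, and,

$$\begin{aligned}
D(\varepsilon) &= \sum_{l=0}^{k_D - h_D} \Big( \sum_{l_1 + l_2 = l,\ l_1, l_2 \ge 0} (a_{h_A \wedge h_B + l_1} + b_{h_A \wedge h_B + l_1})\, c_{h_C + l_2} \Big) \varepsilon^{h_D + l} + o_D(\varepsilon^{k_D}) \\
&= \sum_{l=0}^{k_D - h_A - h_C} \Big( \sum_{l_1 + l_2 = l,\ l_1, l_2 \ge 0} a_{h_A + l_1} c_{h_C + l_2} \Big) \varepsilon^{h_A + h_C + l} \\
&\quad + \sum_{l=0}^{k_D - h_B - h_C} \Big( \sum_{l_1 + l_2 = l,\ l_1, l_2 \ge 0} b_{h_B + l_1} c_{h_C + l_2} \Big) \varepsilon^{h_B + h_C + l} + o_D(\varepsilon^{k_D}),
\end{aligned} \tag{50}$$

where $a_{h_A \wedge h_B + l} = 0$ for $0 \le l < h_A - h_A \wedge h_B$, $b_{h_A \wedge h_B + l} = 0$ for $0 \le l < h_B - h_A \wedge h_B$.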
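The index rules of Lemma 10.8 can be checked numerically. The following is a minimal sketch, not the authors' implementation: the class name `Laurent` and its methods are hypothetical, only the coefficients $a_{h}, \ldots, a_{k}$ are stored (the remainder $o(\varepsilon^{k})$ is implicit), and the sum and product follow formulas (46) and (48).

```python
from fractions import Fraction


class Laurent:
    """Coefficients of a_h*eps**h + ... + a_k*eps**k + o(eps**k); remainder not stored."""

    def __init__(self, h, coeffs):
        self.h = h                                   # lowest exponent h_A
        self.coeffs = [Fraction(c) for c in coeffs]  # coeffs[l] multiplies eps**(h + l)
        self.k = h + len(self.coeffs) - 1            # highest exponent k_A

    def coef(self, n):
        # Coefficient of eps**n, padded with zeros below h as in Lemma 10.8.
        return self.coeffs[n - self.h] if self.h <= n <= self.k else Fraction(0)

    def __add__(self, other):
        # Summation rule: h_C = h_A ^ h_B, k_C = k_A ^ k_B, formula (46).
        h, k = min(self.h, other.h), min(self.k, other.k)
        return Laurent(h, [self.coef(n) + other.coef(n) for n in range(h, k + 1)])

    def __mul__(self, other):
        # Multiplication rule: h_C = h_A + h_B, k_C = (h_A + k_B) ^ (k_A + h_B), formula (48).
        h = self.h + other.h
        k = min(self.h + other.k, self.k + other.h)
        out = []
        for l in range(k - h + 1):
            out.append(sum(self.coef(self.h + l1) * other.coef(other.h + l - l1)
                           for l1 in range(l + 1)))
        return Laurent(h, out)

    def key(self):
        return (self.h, self.k, tuple(self.coeffs))


# Illustrative expansions (arbitrary coefficients chosen for the example).
A = Laurent(0, [1, 2, 3])     # 1 + 2*eps + 3*eps**2 + o(eps**2)
B = Laurent(1, [1, 1])        # eps + eps**2 + o(eps**2)
C = Laurent(-1, [2, 0, 1])    # 2/eps + eps + o(eps)
```

With these inputs, `(A + B) * C` and `A * C + B * C` produce the same exponents and coefficients, which illustrates the distributive identity (50) on one example; it is of course not a proof.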


The summation and multiplication rules for computing upper bounds for remainders presented in propositions (ii) and (iii) of Lemma 10.4 possess the commutative property. This follows from formulas (23) and (25) given in Lemma 10.6.
However, the summation and multiplication rules for computing upper bounds for remainders presented in propositions (ii) and (iii) of Lemma 10.4 do not possess the associative and distributive properties. The question about the form of upper bounds for the corresponding remainders, which would possess these properties, remains open.

As follows from Lemma 10.4, the operational rules presented in this lemma possess a special property that lets one give an effective lower bound for parameter $\delta_A$ for any $(h_A, k_A, \delta_A, G_A, \varepsilon_A)$-expansion $A(\varepsilon)$ obtained as the result of a finite sequence of operations (multiplication by a constant, summation, multiplication, and division) performed with $(h_{A_i}, k_{A_i}, \delta_{A_i}, G_{A_i}, \varepsilon_{A_i})$-expansions $A_i(\varepsilon)$, $i = 1, \ldots, N$ from some finite set of such expansions.

The following lemma holds.


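Lemma 10.9 The operational rules for computing remainders of asymptotic expansions with explicit upper bounds for remainders presented in propositions (ii) and (iii) of Lemma 10.4 possess the following properties:


(i) If $C(\varepsilon) = A(\varepsilon) + B(\varepsilon) = B(\varepsilon) + A(\varepsilon)$ then $\delta_C = \delta_{A+B} = \delta_{B+A}$, $G_C = G_{A+B} = G_{B+A}$ and $\varepsilon_C = \varepsilon_{A+B} = \varepsilon_{B+A}$, where parameters $\delta_C$, $G_C$ and $\varepsilon_C$ are given by formula (14) in proposition (ii) of Lemma 10.4.

(ii) If $C(\varepsilon) = A(\varepsilon) \cdot B(\varepsilon) = B(\varepsilon) \cdot A(\varepsilon)$ then $\delta_C = \delta_{A \cdot B} = \delta_{B \cdot A}$, $G_C = G_{A \cdot B} = G_{B \cdot A}$ and $\varepsilon_C = \varepsilon_{A \cdot B} = \varepsilon_{B \cdot A}$, where parameters $\delta_C$, $G_C$ and $\varepsilon_C$ are given by formula (15) in proposition (iii) of Lemma 10.4.

(iii) If $A(\varepsilon)$ is an $(h_A, k_A, \delta_A, G_A, \varepsilon_A)$-expansion obtained as the result of a finite sequence of operations (multiplication by a constant, summation, multiplication, and quotient) performed with $(h_{A_i}, k_{A_i}, \delta_{A_i}, G_{A_i}, \varepsilon_{A_i})$-expansions $A_i(\varepsilon)$, $i = 1, \ldots, N$ from some finite set of such expansions, then $\delta_A \ge \delta_N^* = \min_{1 \le i \le N} \delta_{A_i}$, which makes it possible to rewrite $A(\varepsilon)$ as the $(h_A, k_A, \delta_N^*, G_{A,N}^*, \varepsilon_A)$-expansion, with parameter $G_{A,N}^* = G_A \varepsilon_A^{\delta_A - \delta_N^*}$.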
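The choice of the constant in proposition (iii) can be seen from a one-line estimate. The sketch below assumes that an $(h_A, k_A, \delta_A, G_A, \varepsilon_A)$-expansion means $|o_A(\varepsilon^{k_A})| \le G_A \varepsilon^{k_A + \delta_A}$ for $0 < \varepsilon \le \varepsilon_A$, which is the form used in condition B' below; the exact definition is given earlier in the chapter.

```latex
% For 0 < \varepsilon \le \varepsilon_A and \delta_A \ge \delta_N^*:
\begin{aligned}
|o_A(\varepsilon^{k_A})|
  &\le G_A\, \varepsilon^{k_A + \delta_A}
   = G_A\, \varepsilon^{\delta_A - \delta_N^*}\, \varepsilon^{k_A + \delta_N^*} \\
  &\le G_A\, \varepsilon_A^{\delta_A - \delta_N^*}\, \varepsilon^{k_A + \delta_N^*}
   = G_{A,N}^*\, \varepsilon^{k_A + \delta_N^*},
\end{aligned}
% since \varepsilon \mapsto \varepsilon^{\delta_A - \delta_N^*} is nondecreasing
% on (0, \varepsilon_A] when \delta_A - \delta_N^* \ge 0.
```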


3 Perturbed Markov Chains and Semi-Markov Processes 


In this section, we formulate basic perturbation conditions for Markov chains and 
semi-Markov processes and give basic formulas for stationary distributions for semi- 
Markov processes, in particular, formulas connecting stationary distributions with 
expectations of return times. 


3.1 Perturbed Markov Chains 


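Let $X = \{1, \ldots, N\}$ and $\eta_n^{(\varepsilon)}$, $n = 0, 1, \ldots$ be, for every $\varepsilon \in (0, \varepsilon_0]$, a homogeneous Markov chain with a phase space $X$, an initial distribution $\bar{p}^{(\varepsilon)} = \langle p_i^{(\varepsilon)} = \mathsf{P}\{\eta_0^{(\varepsilon)} = i\}, i \in X \rangle$ and transition probabilities defined for $i, j \in X$,

$$p_{ij}(\varepsilon) = \mathsf{P}\{\eta_{n+1}^{(\varepsilon)} = j / \eta_n^{(\varepsilon)} = i\}. \tag{51}$$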


Let us assume that the following condition holds: 


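A: There exist sets $Y_i \subseteq X$, $i \in X$ and $\varepsilon_0 \in (0, 1]$ such that: (a) probabilities $p_{ij}(\varepsilon) > 0$, $j \in Y_i$, $i \in X$, for $\varepsilon \in (0, \varepsilon_0]$; (b) probabilities $p_{ij}(\varepsilon) = 0$, $j \in \overline{Y}_i$, $i \in X$, for $\varepsilon \in (0, \varepsilon_0]$; (c) there exist $n_{ij} \ge 1$ and a chain of states $i = l_{ij,0}, l_{ij,1}, \ldots, l_{ij,n_{ij}} = j$ such that $l_{ij,1} \in Y_{l_{ij,0}}, \ldots, l_{ij,n_{ij}} \in Y_{l_{ij,n_{ij}-1}}$, for every pair of states $i, j \in X$.


We refer to sets $Y_i$, $i \in X$ as transition sets.

Condition A implies that all sets $Y_i \ne \emptyset$, $i \in X$, since matrix $\|p_{ij}(\varepsilon)\|$ is stochastic, for every $\varepsilon \in (0, \varepsilon_0]$.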

We now assume that the following perturbation condition holds: 


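B: $p_{ij}(\varepsilon) = \sum_{l = l_{ij}^-}^{l_{ij}^+} a_{ij}[l]\varepsilon^l + o_{ij}(\varepsilon^{l_{ij}^+})$, where $a_{ij}[l_{ij}^-] > 0$ and $0 \le l_{ij}^- \le l_{ij}^+ < \infty$, for $j \in Y_i$, $i \in X$, and $o_{ij}(\varepsilon^{l_{ij}^+})/\varepsilon^{l_{ij}^+} \to 0$ as $\varepsilon \to 0$, for $j \in Y_i$, $i \in X$.


Some additional conditions should be imposed on parameters $\varepsilon_0 \in (0, 1]$ and $l_{ij}^{\pm}$, $j \in Y_i$, $i \in X$, and coefficients $a_{ij}[l]$, $l = l_{ij}^-, \ldots, l_{ij}^+$, $j \in Y_i$, $i \in X$, appearing in the asymptotic expansions of condition B, in order for this condition to be consistent with the model assumption that matrix $\|p_{ij}(\varepsilon)\|$ is stochastic, for every $\varepsilon \in (0, \varepsilon_0]$, and with condition A.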

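Condition B implies that there exists $\varepsilon_0 \in (0, 1]$ such that the following relation holds,

$$p_{ij}(\varepsilon) = \sum_{l = l_{ij}^-}^{l_{ij}^+} a_{ij}[l]\varepsilon^l + o_{ij}(\varepsilon^{l_{ij}^+}) > 0, \quad j \in Y_i,\ i \in X,\ \varepsilon \in (0, \varepsilon_0]. \tag{52}$$

Thus, condition B is consistent with condition A (a).
The model assumption that matrix $\|p_{ij}(\varepsilon)\|$ is stochastic is, under conditions A (a) and (b), equivalent to the following relation,

$$\sum_{j \in Y_i} p_{ij}(\varepsilon) = 1, \quad i \in X,\ \varepsilon \in (0, \varepsilon_0]. \tag{53}$$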


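Condition B and proposition (i) (the multiple summation rule) of Lemma 10.5 imply that the sum $\sum_{j \in Z} p_{ij}(\varepsilon)$ can, for every subset $Z \subseteq Y_i$ and $i \in X$, be represented in the form of the following asymptotic expansion,

$$\sum_{j \in Z} p_{ij}(\varepsilon) = \sum_{l = l_{i,Z}^-}^{l_{i,Z}^+} a_{i,Z}[l]\varepsilon^l + o_{i,Z}(\varepsilon^{l_{i,Z}^+}), \tag{54}$$

where

$$l_{i,Z}^{\pm} = \min_{j \in Z} l_{ij}^{\pm}, \tag{55}$$

and

$$a_{i,Z}[l] = \sum_{j \in Z} a_{ij}[l], \quad l = l_{i,Z}^-, \ldots, l_{i,Z}^+, \tag{56}$$

where $a_{ij}[l] = 0$, for $0 \le l < l_{ij}^-$, $j \in Z$, and

$$o_{i,Z}(\varepsilon^{l_{i,Z}^+}) = \sum_{j \in Z} \Big( \sum_{l_{i,Z}^+ < l \le l_{ij}^+} a_{ij}[l]\varepsilon^l + o_{ij}(\varepsilon^{l_{ij}^+}) \Big). \tag{57}$$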


In terms of asymptotic expansions, the constant 1 can be represented, for every $n = 0, 1, \ldots$, in the form of the following pivotal $(0, n)$-expansion,

$$1 = 1 + 0\varepsilon + \cdots + 0\varepsilon^n + o_n(\varepsilon^n), \tag{58}$$

where remainders $o_n(\varepsilon^n) \equiv 0$, $n = 0, 1, \ldots$.

Moreover, the above expansion is a $(0, n, 1, G, \varepsilon_0)$-expansion for any $0 < G < \infty$ and $n = 0, 1, \ldots$.

Relation (53) permits us to apply Lemma 10.2 to the asymptotic expansions given in relations (54) and (58). Note that, in this case, $l_{i,Y_i}^- = 0$, otherwise the expression on the right hand side in (54) would converge to zero as $\varepsilon \to 0$. Let us take $n = l_{i,Y_i}^+$ in relation (58). In this case, the parameters of the asymptotic expansion given in relation (58) are $h = 0$ and $k = l_{i,Y_i}^+$.

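Lemma 10.2 and the model stochasticity assumption (53) imply that, in this case, the following condition should hold for the coefficients of asymptotic expansions appearing in condition B:


C: (a) $a_{i,Y_i}[l] = \sum_{j \in Y_i} a_{ij}[l] = \mathrm{I}(l = 0)$, $0 \le l \le l_{i,Y_i}^+$, $i \in X$, where $a_{ij}[l] = 0$, for $0 \le l < l_{ij}^-$, $j \in Y_i$, $i \in X$; (b) $o_{i,Y_i}(\varepsilon^{l_{i,Y_i}^+}) = o_{l_{i,Y_i}^+}(\varepsilon^{l_{i,Y_i}^+}) \equiv 0$, $i \in X$.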
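Part (a) of condition C is a finite check on the coefficients of one row of the matrix. The following is a minimal sketch under assumed conventions: the function name `satisfies_condition_C` and the data layout (a dict mapping each $j \in Y_i$ to a pair of the lower exponent $l_{ij}^-$ and the coefficient list) are hypothetical and chosen only for illustration. Part (b), which concerns remainders, is not checked.

```python
from fractions import Fraction


def satisfies_condition_C(row):
    """Check C (a) for one state i.

    row maps j -> (l_minus, coeffs), where coeffs[n] multiplies eps**(l_minus + n).
    """
    # l^+_{i,Y_i} = min over j of l^+_{ij}, as in relation (55).
    l_plus = min(lm + len(c) - 1 for lm, c in row.values())
    for l in range(l_plus + 1):
        # a_{i,Y_i}[l]: coefficients below l^-_{ij} count as zero.
        total = sum(Fraction(c[l - lm]) for lm, c in row.values() if l >= lm)
        if total != (1 if l == 0 else 0):
            return False
    return True


# p_i1 = 1 - eps + o(eps), p_i2 = eps - eps**2 + o(eps**2): coefficients sum to I(l = 0).
good_row = {1: (0, [1, -1]), 2: (1, [1, -1])}
# p_i2 = 2*eps + o(eps) breaks the balance at order eps.
bad_row = {1: (0, [1, -1]), 2: (1, [2])}
```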


Remark 10.1 It is possible to prove that conditions A-C and the model stochasticity assumption (53) imply that the asymptotic expansion in (54) satisfies, for every $Z \subseteq Y_i$ and $i \in X$, one of the following additional conditions: (a) $l_{i,Z}^- > 0$; (b) $l_{i,Z}^- = 0$, $a_{i,Z}[0] < 1$; (c) $l_{i,Z}^- = 0$, $a_{i,Z}[0] = 1$ and there exists $0 < l_{i,Z} \le l_{i,Z}^+$ such that $a_{i,Z}[l] = 0$, $0 < l < l_{i,Z}$, but $a_{i,Z}[l_{i,Z}] < 0$; or (d) $l_{i,Z}^- = 0$, $a_{i,Z}[0] = 1$ and $a_{i,Z}[l] = 0$, $0 < l \le l_{i,Z}^+$, but the remainder $o_{i,Z}(\varepsilon^{l_{i,Z}^+})$ is a nonpositive function of $\varepsilon$.


The above proposition implies that there exists $\varepsilon_0 \in (0, 1]$ such that $\sum_{l = l_{i,Z}^-}^{l_{i,Z}^+} a_{i,Z}[l]\varepsilon^l + o_{i,Z}(\varepsilon^{l_{i,Z}^+}) \le 1$, $Z \subseteq Y_i$, $i \in X$, $\varepsilon \in (0, \varepsilon_0]$. Thus, conditions A-C are also consistent with the relations $\sum_{j \in Z} p_{ij}(\varepsilon) \le 1$, $Z \subseteq Y_i$, $i \in X$, $\varepsilon \in (0, \varepsilon_0]$, which follow from the model stochasticity assumption (53).

In the case where the asymptotic expansions appearing in condition B are supposed to be given in the form of asymptotic expansions with explicit upper bounds for remainders, we replace it by the following stronger perturbation condition:




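B': $p_{ij}(\varepsilon) = \sum_{l = l_{ij}^-}^{l_{ij}^+} a_{ij}[l]\varepsilon^l + o_{ij}(\varepsilon^{l_{ij}^+})$, where $a_{ij}[l_{ij}^-] > 0$ and $0 \le l_{ij}^- \le l_{ij}^+ < \infty$, for $j \in Y_i$, $i \in X$, and $|o_{ij}(\varepsilon^{l_{ij}^+})| \le G_{ij}\varepsilon^{l_{ij}^+ + \delta_{ij}}$, $0 < \varepsilon \le \varepsilon_{ij}$, for $j \in Y_i$, $i \in X$, where $0 < \delta_{ij} \le 1$, $0 < G_{ij} < \infty$ and $0 < \varepsilon_{ij} \le \varepsilon_0$.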


3.2 Perturbed Semi-Markov Processes 


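Let $X = \{1, \ldots, N\}$ and $(\eta_n^{(\varepsilon)}, \kappa_n^{(\varepsilon)})$, $n = 0, 1, \ldots$ be, for every $\varepsilon \in (0, \varepsilon_0]$, a Markov renewal process, i.e., a homogeneous Markov chain with the phase space $X \times [0, \infty)$, an initial distribution $\bar{p}^{(\varepsilon)} = \langle p_i^{(\varepsilon)} = \mathsf{P}\{\eta_0^{(\varepsilon)} = i, \kappa_0^{(\varepsilon)} = 0\} = \mathsf{P}\{\eta_0^{(\varepsilon)} = i\}, i \in X \rangle$ and transition probabilities defined for $(i, s), (j, t) \in X \times [0, \infty)$,

$$Q_{ij}^{(\varepsilon)}(t) = \mathsf{P}\{\eta_{n+1}^{(\varepsilon)} = j, \kappa_{n+1}^{(\varepsilon)} \le t / \eta_n^{(\varepsilon)} = i, \kappa_n^{(\varepsilon)} = s\}. \tag{59}$$

In this case, the random sequence $\eta_n^{(\varepsilon)}$, $n = 0, 1, \ldots$ is also a homogeneous (embedded) Markov chain with the phase space $X$ and transition probabilities defined for $i, j \in X$,

$$p_{ij}(\varepsilon) = \mathsf{P}\{\eta_{n+1}^{(\varepsilon)} = j / \eta_n^{(\varepsilon)} = i\} = Q_{ij}^{(\varepsilon)}(\infty). \tag{60}$$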


We assume that condition A holds. This implies that the Markov chain $\eta_n^{(\varepsilon)}$ has one class of communicative states, for every $\varepsilon \in (0, \varepsilon_0]$.
We also assume that the following condition excluding instant transitions holds:


D: $Q_{ij}^{(\varepsilon)}(0) = 0$, $i, j \in X$, for every $\varepsilon \in (0, \varepsilon_0]$.


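Let us now introduce a semi-Markov process,

$$\eta^{(\varepsilon)}(t) = \eta^{(\varepsilon)}_{\nu^{(\varepsilon)}(t)}, \quad t \ge 0, \tag{61}$$

where

$$\nu^{(\varepsilon)}(t) = \max(n \ge 0 : \zeta_n^{(\varepsilon)} \le t), \quad t \ge 0, \tag{62}$$

is the number of jumps in the time interval $[0, t]$, and

$$\zeta_n^{(\varepsilon)} = \kappa_1^{(\varepsilon)} + \cdots + \kappa_n^{(\varepsilon)}, \quad n = 0, 1, \ldots, \tag{63}$$

are sequential moments of jumps for the semi-Markov process $\eta^{(\varepsilon)}(t)$.

If $Q_{ij}^{(\varepsilon)}(t) = (1 - e^{-\lambda_i(\varepsilon)t})p_{ij}(\varepsilon)$, $t \ge 0$, $i, j \in X$, then $\eta^{(\varepsilon)}(t)$, $t \ge 0$ is a continuous time homogeneous Markov chain.

If $Q_{ij}^{(\varepsilon)}(t) = \mathrm{I}(t \ge 1)p_{ij}(\varepsilon)$, $t \ge 0$, $i, j \in X$, then $\eta^{(\varepsilon)}(t) = \eta^{(\varepsilon)}_{[t]}$, $t \ge 0$ is a discrete time homogeneous Markov chain embedded in continuous time.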

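Let us also introduce expectations of sojourn times,

$$e_{ij}(\varepsilon) = \mathsf{E}_i \kappa_1^{(\varepsilon)} \mathrm{I}(\eta_1^{(\varepsilon)} = j) = \int_0^\infty t\, Q_{ij}^{(\varepsilon)}(dt), \quad i, j \in X. \tag{64}$$

We also assume that the following condition holds:

E: $e_{ij}(\varepsilon) < \infty$, $i, j \in X$, for $\varepsilon \in (0, \varepsilon_0]$.


Here and henceforth, notations $\mathsf{P}_i$ and $\mathsf{E}_i$ are used for conditional probabilities and expectations under the condition $\eta_0^{(\varepsilon)} = i$.

In the case of a continuous time Markov chain, $e_{ij}(\varepsilon) = \lambda_i(\varepsilon)^{-1} p_{ij}(\varepsilon)$, $i, j \in X$.

In the case of a discrete time Markov chain, $e_{ij}(\varepsilon) = p_{ij}(\varepsilon)$, $i, j \in X$.

Conditions A and D imply that, for every $\varepsilon \in (0, \varepsilon_0]$, expectations $e_{ij}(\varepsilon) > 0$, for $j \in Y_i$, $i \in X$, and $e_{ij}(\varepsilon) = 0$, for $j \in \overline{Y}_i$, $i \in X$.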

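We now assume that the following perturbation condition holds:

F: $e_{ij}(\varepsilon) = \sum_{l = m_{ij}^-}^{m_{ij}^+} b_{ij}[l]\varepsilon^l + \dot{o}_{ij}(\varepsilon^{m_{ij}^+})$, where $b_{ij}[m_{ij}^-] > 0$ and $-\infty < m_{ij}^- \le m_{ij}^+ < \infty$, for $j \in Y_i$, $i \in X$, and $\dot{o}_{ij}(\varepsilon^{m_{ij}^+})/\varepsilon^{m_{ij}^+} \to 0$ as $\varepsilon \to 0$, for $j \in Y_i$, $i \in X$.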


In particular, in the case of a discrete time Markov chain, condition B implies that condition F holds, since, in this case, expectations $e_{ij}(\varepsilon) = p_{ij}(\varepsilon)$, $j \in Y_i$, $i \in X$.
Condition F implies that there exists $\varepsilon_0 \in (0, 1]$ such that the following relation holds,

$$e_{ij}(\varepsilon) > 0, \quad j \in Y_i,\ i \in X,\ \varepsilon \in (0, \varepsilon_0]. \tag{65}$$

This is consistent with condition D.

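In the case where the asymptotic expansions appearing in condition F are given in the form of asymptotic expansions with explicit upper bounds for remainders, we assume that the following stronger perturbation condition holds:


F': $e_{ij}(\varepsilon) = \sum_{l = m_{ij}^-}^{m_{ij}^+} b_{ij}[l]\varepsilon^l + \dot{o}_{ij}(\varepsilon^{m_{ij}^+})$, where $b_{ij}[m_{ij}^-] > 0$ and $-\infty < m_{ij}^- \le m_{ij}^+ < \infty$, for $j \in Y_i$, $i \in X$, and $|\dot{o}_{ij}(\varepsilon^{m_{ij}^+})| \le \dot{G}_{ij}\varepsilon^{m_{ij}^+ + \dot{\delta}_{ij}}$, $0 < \varepsilon \le \dot{\varepsilon}_{ij}$, for $j \in Y_i$, $i \in X$, where $0 < \dot{\delta}_{ij} \le 1$, $0 < \dot{G}_{ij} < \infty$ and $0 < \dot{\varepsilon}_{ij} \le \varepsilon_0$.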


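It is also worth noting that the perturbation conditions B and F are independent.

To see this, let us take arbitrary functions $p_{ij}(\varepsilon)$, $j \in Y_i$, $i \in X$ and $e_{ij}(\varepsilon)$, $j \in Y_i$, $i \in X$ satisfying, respectively, conditions B and F, and, also, relations (52), (53) and (65). Then, there exist semi-Markov transition probabilities $Q_{ij}^{(\varepsilon)}(t)$, $t \ge 0$, $j \in Y_i$, $i \in X$ such that $Q_{ij}^{(\varepsilon)}(\infty) = p_{ij}(\varepsilon)$, $j \in Y_i$, $i \in X$ and $\int_0^\infty t\, Q_{ij}^{(\varepsilon)}(dt) = e_{ij}(\varepsilon)$, $j \in Y_i$, $i \in X$, for every $\varepsilon \in (0, \varepsilon_0]$.

It is readily seen that, for example, semi-Markov transition probabilities $Q_{ij}^{(\varepsilon)}(t) = \mathrm{I}(t \ge e_{ij}(\varepsilon)/p_{ij}(\varepsilon))\,p_{ij}(\varepsilon)$, $t \ge 0$, $j \in Y_i$, $i \in X$ satisfy the above relations.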
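The verification for this example is short and can be written out as follows. This is a sketch of the check, using only relations (52), (60), (64) and (65) as stated above.

```latex
% Q_{ij}^{(\varepsilon)}(t) has a single atom of mass p_{ij}(\varepsilon)
% at the point t_{ij} = e_{ij}(\varepsilon)/p_{ij}(\varepsilon) > 0, hence
\begin{aligned}
Q_{ij}^{(\varepsilon)}(\infty) &= p_{ij}(\varepsilon), \\
\int_0^\infty t\, Q_{ij}^{(\varepsilon)}(dt)
  &= \frac{e_{ij}(\varepsilon)}{p_{ij}(\varepsilon)} \cdot p_{ij}(\varepsilon)
   = e_{ij}(\varepsilon), \\
Q_{ij}^{(\varepsilon)}(0) &= 0 \quad \text{(condition D), since } t_{ij} > 0 .
\end{aligned}
```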
3.3 Stationary Distributions for Semi-Markov Processes


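Condition A guarantees that the phase space $X$ is one class of communicative states for the Markov chain $\eta_n^{(\varepsilon)}$, for every $\varepsilon \in (0, \varepsilon_0]$, i.e., the Markov chain $\eta_n^{(\varepsilon)}$ is ergodic, and, thus, for every $\varepsilon \in (0, \varepsilon_0]$, there exists the unique stationary distribution $\bar{\rho}(\varepsilon) = \langle \rho_1(\varepsilon), \ldots, \rho_N(\varepsilon) \rangle$, which is given by the following ergodic relation, for $i \in X$,

$$\frac{1}{n} \sum_{k=1}^{n} \mathrm{I}(\eta_k^{(\varepsilon)} = i) \longrightarrow \rho_i(\varepsilon) \ \text{as} \ n \to \infty. \tag{66}$$

It is useful to note that the ergodic relation (66) holds for any initial distribution $\bar{p}^{(\varepsilon)} = \langle p_1^{(\varepsilon)}, \ldots, p_N^{(\varepsilon)} \rangle$ and the stationary distribution $\bar{\rho}(\varepsilon)$ does not depend on the initial distribution.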

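As is known, $\rho_i(\varepsilon)$, $i \in X$ is the unique positive solution of the system of linear equations,

$$\begin{cases} \rho_i(\varepsilon) = \sum_{j \in X} \rho_j(\varepsilon) p_{ji}(\varepsilon), \quad i \in X, \\ \sum_{i \in X} \rho_i(\varepsilon) = 1. \end{cases} \tag{67}$$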


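Conditions A, D and E imply that, for every $\varepsilon \in (0, \varepsilon_0]$, the semi-Markov process $\eta^{(\varepsilon)}(t)$ is also ergodic and its stationary distribution $\bar{\pi}(\varepsilon) = \langle \pi_1(\varepsilon), \ldots, \pi_N(\varepsilon) \rangle$ is given by the following ergodic relation, for $i \in X$,

$$\frac{1}{t} \int_0^t \mathrm{I}(\eta^{(\varepsilon)}(s) = i)\, ds \stackrel{\mathsf{P}}{\longrightarrow} \pi_i(\varepsilon) \ \text{as} \ t \to \infty. \tag{68}$$

As in (66), the ergodic relation (68) holds for any initial distribution $\bar{p}^{(\varepsilon)}$ and the stationary distribution $\bar{\pi}(\varepsilon)$ does not depend on the initial distribution.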

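The stationary distributions for the semi-Markov process $\eta^{(\varepsilon)}(t)$ and the embedded Markov chain $\eta_n^{(\varepsilon)}$ are connected by the following relation,

$$\pi_i(\varepsilon) = \frac{\rho_i(\varepsilon) e_i(\varepsilon)}{\sum_{j \in X} \rho_j(\varepsilon) e_j(\varepsilon)}, \quad i \in X, \tag{69}$$

where

$$e_i(\varepsilon) = \mathsf{E}_i \kappa_1^{(\varepsilon)} = \sum_{j \in X} e_{ij}(\varepsilon), \quad i \in X. \tag{70}$$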
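For a fixed value of $\varepsilon$, relations (67) and (69) reduce to plain linear algebra. The following is a minimal sketch with exact rational arithmetic; the helper names `solve`, `embedded_stationary` and `semi_markov_stationary`, and the two-state example, are hypothetical and serve only as an illustration.

```python
from fractions import Fraction


def solve(A, b):
    """Gauss-Jordan elimination over exact fractions; assumes A is square and nonsingular."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(A, b)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)   # pivot row
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [x / piv for x in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[r][n] for r in range(n)]


def embedded_stationary(P):
    """Solve system (67): rho = rho P together with sum(rho) = 1."""
    n = len(P)
    # Balance equations for i < n - 1; the last one is replaced by the normalisation.
    A = [[(1 if i == j else 0) - P[j][i] for j in range(n)] for i in range(n - 1)]
    A.append([1] * n)
    return solve(A, [0] * (n - 1) + [1])


def semi_markov_stationary(P, e):
    """Relation (69): pi_i is proportional to rho_i * e_i."""
    rho = embedded_stationary(P)
    w = [r * ei for r, ei in zip(rho, e)]
    total = sum(w)
    return [x / total for x in w]


# Illustrative two-state chain with mean sojourn times e = (2, 1).
P = [[Fraction(1, 2), Fraction(1, 2)], [Fraction(1, 4), Fraction(3, 4)]]
rho = embedded_stationary(P)
pi = semi_markov_stationary(P, [2, 1])
```

In this example the embedded chain spends twice as many steps in the second state, but each visit to the first state lasts twice as long, so the two stationary probabilities of the semi-Markov process are equal.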
Condition B implies that there exist limits,

$$p_{ij}(0) = \lim_{\varepsilon \to 0} p_{ij}(\varepsilon) = \begin{cases} a_{ij}[0] & \text{if } l_{ij}^- = 0,\ j \in Y_i,\ i \in X, \\ 0 & \text{if } l_{ij}^- > 0,\ j \in Y_i,\ i \in X, \\ 0 & \text{if } j \in \overline{Y}_i,\ i \in X. \end{cases} \tag{71}$$

Matrix $\|p_{ij}(\varepsilon)\|$ is stochastic, for every $\varepsilon \in (0, \varepsilon_0]$ and, thus, matrix $\|p_{ij}(0)\|$ is also stochastic.

However, it is possible that matrix $\|p_{ij}(0)\|$ has more zero elements than matrices $\|p_{ij}(\varepsilon)\|$.

Therefore, a Markov chain $\eta_n^{(0)}$, $n = 0, 1, \ldots$, with the phase space $X$ and the matrix of transition probabilities $\|p_{ij}(0)\|$ may fail to be ergodic, and its phase space $X$ can consist of one or several closed classes of communicative states plus, possibly, a class of transient states.

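Condition F implies that there exist limits,

$$e_{ij}(0) = \lim_{\varepsilon \to 0} e_{ij}(\varepsilon) = \begin{cases} \infty & \text{if } m_{ij}^- < 0,\ j \in Y_i,\ i \in X, \\ b_{ij}[0] & \text{if } m_{ij}^- = 0,\ j \in Y_i,\ i \in X, \\ 0 & \text{if } m_{ij}^- > 0,\ j \in Y_i,\ i \in X, \\ 0 & \text{if } j \in \overline{Y}_i,\ i \in X. \end{cases} \tag{72}$$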

Our goal is to design an effective algorithm for constructing asymptotic expansions for stationary probabilities $\pi_i(\varepsilon)$, $i \in X$, under the assumption that conditions A-F hold.

As we shall see, the proposed algorithm, based on a special technique of sequential phase space reduction, can be applied to models with asymptotically coupled and uncoupled phase spaces and all types of asymptotic behavior of expected sojourn times.

The models of continuous and discrete time Markov chains are particular cases.

In particular, asymptotic expansions for stationary probabilities $\rho_i(\varepsilon)$, $i \in X$ coincide with expansions for stationary probabilities $\pi_i(\varepsilon)$, $i \in X$, for the discrete time Markov chain, where expectations $e_{ij}(\varepsilon) = p_{ij}(\varepsilon)$, $i, j \in X$.


3.4 Expected Hitting Times and Stationary Probabilities 
for Semi-Markov Processes 


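Let us define hitting times, which are random variables given by the following relation, for $j \in X$,

$$\tau_j^{(\varepsilon)} = \sum_{n=1}^{\nu_j^{(\varepsilon)}} \kappa_n^{(\varepsilon)}, \tag{73}$$

where

$$\nu_j^{(\varepsilon)} = \min(n \ge 1 : \eta_n^{(\varepsilon)} = j). \tag{74}$$

Let us denote,

$$E_{ij}(\varepsilon) = \mathsf{E}_i \tau_j^{(\varepsilon)}, \quad i, j \in X. \tag{75}$$

As is known, conditions A, D and E imply that, for every $\varepsilon \in (0, \varepsilon_0]$,

$$0 < E_{ij}(\varepsilon) < \infty, \quad i, j \in X. \tag{76}$$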


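Moreover, under the above conditions, the expectations $E_{ij}(\varepsilon)$, $i \in X$ are, for every $j \in X$, the unique solution of the following system of linear equations,

$$E_{ij}(\varepsilon) = e_i(\varepsilon) + \sum_{r \ne j} p_{ir}(\varepsilon) E_{rj}(\varepsilon), \quad i \in X. \tag{77}$$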


The following relation plays an important role in what follows,

$$\pi_i(\varepsilon) = \frac{e_i(\varepsilon)}{E_{ii}(\varepsilon)}, \quad i \in X. \tag{78}$$

In fact, this formula is an alternative form of relation (69). Indeed, as is known, $E_{ii}(\varepsilon) = \sum_{j \in X} e_j(\varepsilon) f_{ii,j}(\varepsilon)$, where $f_{ii,j}(\varepsilon)$ is the expected number of visits by the Markov chain $\eta_n^{(\varepsilon)}$ to the state $j$ between two sequential visits to the state $i$. As is also known, $f_{ii,j}(\varepsilon) = \rho_j(\varepsilon)/\rho_i(\varepsilon)$, $i, j \in X$.

Formula (78) permits us to reduce the problem of constructing asymptotic expansions for semi-Markov stationary probabilities $\pi_i(\varepsilon)$ to the problem of constructing Laurent asymptotic expansions for expectations of hitting times $E_{ii}(\varepsilon)$.
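Relations (77) and (78) can be illustrated for a fixed $\varepsilon$ on a small numerical example. The sketch below is an assumption-laden illustration rather than the chapter's algorithm: the helper names `solve` and `expected_hitting_times` and the two-state chain are hypothetical.

```python
from fractions import Fraction


def solve(A, b):
    """Gauss-Jordan elimination over exact fractions; assumes A is square and nonsingular."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(A, b)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)   # pivot row
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [x / piv for x in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[r][n] for r in range(n)]


def expected_hitting_times(P, e, j):
    """Solve system (77) for a fixed target j: E_ij = e_i + sum over r != j of p_ir E_rj."""
    n = len(P)
    A = [[(1 if i == r else 0) - (P[i][r] if r != j else 0) for r in range(n)]
         for i in range(n)]
    return solve(A, e)


# Illustrative two-state chain with mean sojourn times e = (2, 1).
P = [[Fraction(1, 2), Fraction(1, 2)], [Fraction(1, 4), Fraction(3, 4)]]
e = [Fraction(2), Fraction(1)]
E_to_0 = expected_hitting_times(P, e, 0)     # (E_00, E_10)
E_to_1 = expected_hitting_times(P, e, 1)     # (E_01, E_11)
pi = [e[0] / E_to_0[0], e[1] / E_to_1[1]]    # relation (78)
```

For this chain, relation (78) gives $\pi_1 = \pi_2 = 1/2$, the same values that relation (69) yields with $\rho = (1/3, 2/3)$.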


4 Semi-Markov Processes with Reduced Phase Spaces 


In this section, we present a one-step procedure of phase space reduction for semi-Markov processes and algorithms for re-calculation of asymptotic expansions for perturbed semi-Markov processes with reduced phase spaces.


4.1 Reduction of a Phase Space for Semi-Markov Process 


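Let us choose some state $r$ and consider the reduced phase space ${}_rX = \{i \in X, i \ne r\}$, with the state $r$ excluded from the phase space $X$.

Let us assume that $p_r^{(\varepsilon)} = \mathsf{P}\{\eta_0^{(\varepsilon)} = r\} = 0$ and define the sequential moments of hitting the reduced space ${}_rX$ by the embedded Markov chain $\eta_n^{(\varepsilon)}$,

$${}_r\xi_n^{(\varepsilon)} = \min(k > {}_r\xi_{n-1}^{(\varepsilon)},\ \eta_k^{(\varepsilon)} \in {}_rX), \quad n = 1, 2, \ldots, \quad {}_r\xi_0^{(\varepsilon)} = 0. \tag{79}$$

Now, let us define the random sequence,

$$({}_r\eta_n^{(\varepsilon)}, {}_r\kappa_n^{(\varepsilon)}) = \begin{cases} \Big( \eta^{(\varepsilon)}_{{}_r\xi_n^{(\varepsilon)}},\ \sum_{k = {}_r\xi_{n-1}^{(\varepsilon)} + 1}^{{}_r\xi_n^{(\varepsilon)}} \kappa_k^{(\varepsilon)} \Big) & \text{for } n = 1, 2, \ldots, \\ (\eta_0^{(\varepsilon)}, 0) & \text{for } n = 0. \end{cases} \tag{80}$$

This sequence is also a Markov renewal process with a phase space ${}_rX \times [0, \infty)$, the initial distribution ${}_r\bar{p}^{(\varepsilon)} = \langle p_i^{(\varepsilon)} = \mathsf{P}\{\eta_0^{(\varepsilon)} = i\}, i \in {}_rX \rangle$ (recall that $p_r^{(\varepsilon)} = 0$), and transition probabilities defined for $(i, s), (j, t) \in {}_rX \times [0, \infty)$,

$${}_rQ_{ij}^{(\varepsilon)}(t) = \mathsf{P}\{{}_r\eta_{n+1}^{(\varepsilon)} = j,\ {}_r\kappa_{n+1}^{(\varepsilon)} \le t / {}_r\eta_n^{(\varepsilon)} = i,\ {}_r\kappa_n^{(\varepsilon)} = s\}. \tag{81}$$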

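Respectively, one can define the transformed semi-Markov process with the reduced phase space ${}_rX$,

$${}_r\eta^{(\varepsilon)}(t) = {}_r\eta^{(\varepsilon)}_{{}_r\nu^{(\varepsilon)}(t)}, \quad t \ge 0, \tag{82}$$

where

$${}_r\nu^{(\varepsilon)}(t) = \max(n \ge 0 : {}_r\zeta_n^{(\varepsilon)} \le t), \quad t \ge 0, \tag{83}$$

is the number of jumps in the time interval $[0, t]$, and

$${}_r\zeta_n^{(\varepsilon)} = {}_r\kappa_1^{(\varepsilon)} + \cdots + {}_r\kappa_n^{(\varepsilon)}, \quad n = 0, 1, \ldots, \tag{84}$$

are sequential moments of jumps for the semi-Markov process ${}_r\eta^{(\varepsilon)}(t)$.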
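The transition probabilities ${}_rQ_{ij}^{(\varepsilon)}(t)$ are expressed via the transition probabilities $Q_{ij}^{(\varepsilon)}(t)$ by the following formula, for $i, j \in {}_rX$, $t \ge 0$,

$${}_rQ_{ij}^{(\varepsilon)}(t) = Q_{ij}^{(\varepsilon)}(t) + \sum_{n=0}^{\infty} Q_{ir}^{(\varepsilon)}(t) * Q_{rr}^{(\varepsilon)*n}(t) * Q_{rj}^{(\varepsilon)}(t). \tag{85}$$

Here, symbol $*$ is used to denote a convolution of distribution functions (possibly improper) and $Q_{rr}^{(\varepsilon)*n}(t)$ is the $n$ times convolution of the distribution function $Q_{rr}^{(\varepsilon)}(t)$ given by the following recurrent formula, for $r \in X$,

$$Q_{rr}^{(\varepsilon)*n}(t) = \begin{cases} \int_0^t Q_{rr}^{(\varepsilon)*(n-1)}(t - s)\, Q_{rr}^{(\varepsilon)}(ds) & \text{for } t \ge 0 \text{ and } n \ge 1, \\ \mathrm{I}(t \ge 0) & \text{for } t \ge 0 \text{ and } n = 0. \end{cases} \tag{86}$$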

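Relation (85) directly implies the following formula for transition probabilities of the embedded Markov chain ${}_r\eta_n^{(\varepsilon)}$, for $i, j \in {}_rX$,

$$\begin{aligned} {}_rp_{ij}(\varepsilon) &= {}_rQ_{ij}^{(\varepsilon)}(\infty) \\ &= p_{ij}(\varepsilon) + \sum_{n=0}^{\infty} p_{ir}(\varepsilon) p_{rr}(\varepsilon)^n p_{rj}(\varepsilon) \\ &= p_{ij}(\varepsilon) + p_{ir}(\varepsilon) \frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)}. \end{aligned} \tag{87}$$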


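Let us denote,

$$Y_{ir}^+ = \{j \in Y_i : j \ne r\}, \quad i, r \in X, \tag{88}$$

and

$$Y_{ir}^- = \{j \in {}_rX : r \in Y_i,\ j \in Y_r\}, \quad i \in {}_rX. \tag{89}$$

Condition A implies that sets $Y_{rr}^+ \ne \emptyset$, $r \in X$.
Thus, probabilities $1 - p_{rr}(\varepsilon) > 0$, $r \in X$, for every $\varepsilon \in (0, \varepsilon_0]$.
That is why,

$$\begin{aligned} {}_rY_i &= \{j \in {}_rX : {}_rp_{ij}(\varepsilon) > 0,\ \varepsilon \in (0, \varepsilon_0]\} \\ &= \{j \in {}_rX : j \in Y_i\} \cup \{j \in {}_rX : r \in Y_i,\ j \in Y_r\} \\ &= Y_{ir}^+ \cup Y_{ir}^-, \quad i \in {}_rX. \end{aligned} \tag{90}$$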


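Relation (87) and condition A, assumed to hold for the Markov chain $\eta_n^{(\varepsilon)}$, imply that condition A also holds for the Markov chain ${}_r\eta_n^{(\varepsilon)}$, with the sets ${}_rY_i$, $i \in {}_rX$.

Indeed, let $i \in {}_rX$. If $j \in Y_{ir}^+$ then $p_{ij}(\varepsilon) > 0$ and, thus, ${}_rp_{ij}(\varepsilon) > 0$. If $j \in Y_{ir}^-$, then $p_{ir}(\varepsilon), p_{rj}(\varepsilon) > 0$ and, again, ${}_rp_{ij}(\varepsilon) > 0$. If $j \notin Y_{ir}^+ \cup Y_{ir}^-$, then $p_{ij}(\varepsilon) = 0$ and $p_{ir}(\varepsilon) \cdot p_{rj}(\varepsilon) = 0$. By relation (87), this implies that ${}_rp_{ij}(\varepsilon) = 0$.

Let $i \in {}_rX$. If $Y_{ir}^+ \ne \emptyset$ then ${}_rY_i \ne \emptyset$. If $Y_{ir}^+ = \emptyset$ then $r \in Y_i$ and, thus, $p_{ir}(\varepsilon) > 0$. Then, $Y_{ir}^- = \{j \in {}_rX : p_{rj}(\varepsilon) > 0\} = Y_{rr}^+ \ne \emptyset$. Therefore, sets ${}_rY_i \ne \emptyset$, $i \in {}_rX$.

Thus, conditions A (a) and (b), assumed to hold for the Markov chain $\eta_n^{(\varepsilon)}$, imply that these conditions also hold for the Markov chain ${}_r\eta_n^{(\varepsilon)}$, with sets ${}_rY_i$, $i \in {}_rX$ replacing sets $Y_i$, $i \in X$.

Also, let $i, j \in {}_rX$ and $i = l_0, l_1, \ldots, l_{n_{ij}} = j$ be a chain of states such that $l_1 \in Y_{l_0}, \ldots, l_{n_{ij}} \in Y_{l_{n_{ij}-1}}$. As was remarked above, we can always assume that states $l_1, \ldots, l_{n_{ij}-1}$ are different and that $l_1, \ldots, l_{n_{ij}-1} \ne i, j$. This implies that either $l_1, \ldots, l_{n_{ij}-1} \ne r$ or there exists at most one $1 \le k \le n_{ij} - 1$ such that $l_k = r$. In the first case, $l_1 \in {}_rY_{l_0}, \ldots, l_{n_{ij}} \in {}_rY_{l_{n_{ij}-1}}$. In the second case, $l_1 \in {}_rY_{l_0}, \ldots, l_{k-1} \in {}_rY_{l_{k-2}}$, $l_{k+1} \in {}_rY_{l_{k-1}}, \ldots, l_{n_{ij}} \in {}_rY_{l_{n_{ij}-1}}$.

Thus, condition A (c), assumed to hold for the Markov chain $\eta_n^{(\varepsilon)}$, implies that this condition also holds for the Markov chain ${}_r\eta_n^{(\varepsilon)}$.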
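Let us define distribution functions,

$$F_i^{(\varepsilon)}(t) = \sum_{j \in X} Q_{ij}^{(\varepsilon)}(t), \quad t \ge 0,\ i \in X, \tag{91}$$

and

$$F_{ij}^{(\varepsilon)}(t) = \begin{cases} Q_{ij}^{(\varepsilon)}(t)/p_{ij}(\varepsilon) & \text{for } t \ge 0 \text{ if } p_{ij}(\varepsilon) > 0, \\ F_i^{(\varepsilon)}(t) & \text{for } t \ge 0 \text{ if } p_{ij}(\varepsilon) = 0, \end{cases} \quad i, j \in X. \tag{92}$$

Obviously,

$$\tilde{e}_{ij}(\varepsilon) = \int_0^\infty t\, F_{ij}^{(\varepsilon)}(dt) = \begin{cases} e_{ij}(\varepsilon)/p_{ij}(\varepsilon) & \text{if } p_{ij}(\varepsilon) > 0, \\ e_i(\varepsilon) & \text{if } p_{ij}(\varepsilon) = 0, \end{cases} \tag{93}$$

and

$$e_i(\varepsilon) = \int_0^\infty t\, F_i^{(\varepsilon)}(dt), \quad i \in X. \tag{94}$$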

Also, let us introduce expectations,

$${}_re_{ij}(\varepsilon) = \int_0^\infty t\, {}_rQ_{ij}^{(\varepsilon)}(dt), \quad i, j \in {}_rX. \tag{95}$$

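Relation (85) directly implies the following formula for expectations ${}_re_{ij}(\varepsilon)$, $i, j \in {}_rX$,

$$\begin{aligned} {}_re_{ij}(\varepsilon) &= \tilde{e}_{ij}(\varepsilon) p_{ij}(\varepsilon) + \sum_{n=0}^{\infty} \big( \tilde{e}_{ir}(\varepsilon) + n\tilde{e}_{rr}(\varepsilon) + \tilde{e}_{rj}(\varepsilon) \big) p_{ir}(\varepsilon) p_{rr}(\varepsilon)^n p_{rj}(\varepsilon) \\ &= e_{ij}(\varepsilon) + e_{ir}(\varepsilon) \frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)} + e_{rr}(\varepsilon) \frac{p_{ir}(\varepsilon) p_{rj}(\varepsilon)}{(1 - p_{rr}(\varepsilon))^2} + e_{rj}(\varepsilon) \frac{p_{ir}(\varepsilon)}{1 - p_{rr}(\varepsilon)}. \end{aligned} \tag{96}$$

Relation (96) implies that conditions D and E, assumed to hold for the semi-Markov process $\eta^{(\varepsilon)}(t)$, also hold for the semi-Markov process ${}_r\eta^{(\varepsilon)}(t)$.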
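For a fixed $\varepsilon$, the one-step reduction described by relations (87) and (96) is a direct computation on the matrices $\|p_{ij}(\varepsilon)\|$ and $\|e_{ij}(\varepsilon)\|$. The following is a minimal numeric sketch; the function name `reduce_phase_space` and the three-state example with unit sojourn times are hypothetical, chosen only to illustrate the formulas.

```python
from fractions import Fraction as F


def reduce_phase_space(P, E, r):
    """Exclude state r; return reduced transition probabilities and sojourn expectations."""
    keep = [i for i in range(len(P)) if i != r]
    q = 1 - P[r][r]          # non-absorption probability, positive under condition A
    # Relation (87).
    rP = [[P[i][j] + P[i][r] * P[r][j] / q for j in keep] for i in keep]
    # Relation (96).
    rE = [[E[i][j]
           + E[i][r] * P[r][j] / q
           + E[r][r] * P[i][r] * P[r][j] / q ** 2
           + E[r][j] * P[i][r] / q
           for j in keep] for i in keep]
    return rP, rE


# Illustrative discrete time chain: every sojourn time equals 1, so e_ij = p_ij.
P = [[F(0), F(1, 2), F(1, 2)],
     [F(1, 2), F(0), F(1, 2)],
     [F(1, 4), F(1, 4), F(1, 2)]]
E = [row[:] for row in P]
rP, rE = reduce_phase_space(P, E, 2)
```

In this example, a transition of the reduced chain from state 0 takes one step plus, with probability 1/2, a geometric number of steps in state 2 with mean 2, so the expected sojourn time `sum(rE[0])` equals 2.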


4.2 Hitting Times for Reduced Semi-Markov Processes 


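The first hitting times to a state $j \ne r$ are connected for the Markov chains $\eta_n^{(\varepsilon)}$ and ${}_r\eta_n^{(\varepsilon)}$ by the following relation,

$$\begin{aligned} \nu_j^{(\varepsilon)} &= \min(n \ge 1 : \eta_n^{(\varepsilon)} = j) \\ &= \min({}_r\xi_n^{(\varepsilon)} \ge 1 : {}_r\eta_n^{(\varepsilon)} = j) = {}_r\xi^{(\varepsilon)}_{{}_r\nu_j^{(\varepsilon)}}, \end{aligned} \tag{97}$$

where

$${}_r\nu_j^{(\varepsilon)} = \min(n \ge 1 : {}_r\eta_n^{(\varepsilon)} = j). \tag{98}$$

Relations (97) and (98) imply that the following relation holds for the first hitting times to a state $j \ne r$ for the semi-Markov processes $\eta^{(\varepsilon)}(t)$ and ${}_r\eta^{(\varepsilon)}(t)$,

$$\tau_j^{(\varepsilon)} = \sum_{n=1}^{\nu_j^{(\varepsilon)}} \kappa_n^{(\varepsilon)} = \sum_{n=1}^{{}_r\nu_j^{(\varepsilon)}} {}_r\kappa_n^{(\varepsilon)} = {}_r\tau_j^{(\varepsilon)}. \tag{99}$$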


Let us summarize the above remarks in the following theorem, which plays the key role in what follows.


Theorem 10.1 Let conditions A, D and E hold and the initial distribution satisfy the assumption $p_r^{(\varepsilon)} = 0$, for every $\varepsilon \in (0, \varepsilon_0]$. Then, for any state $j \ne r$, the first hitting times $\tau_j^{(\varepsilon)}$ and ${}_r\tau_j^{(\varepsilon)}$ to the state $j$, respectively, for the semi-Markov processes $\eta^{(\varepsilon)}(t)$ and ${}_r\eta^{(\varepsilon)}(t)$, coincide.


4.3 Asymptotic Expansions for Non-absorption Probabilities 


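As was mentioned above, condition A implies that the non-absorption probability $\bar{p}_{ii}(\varepsilon) = 1 - p_{ii}(\varepsilon) > 0$, $i \in X$, $\varepsilon \in (0, \varepsilon_0]$.
Let us introduce the set,

$$Y = \{i \in X : i \in Y_i\}. \tag{100}$$

Algorithm 1. This is an algorithm for constructing asymptotic expansions for non-absorption probabilities $\bar{p}_{ii}(\varepsilon)$, $i \in X$.

Case 1: $i \in Y$.

Let us use the following relation, which holds, for every $i \in Y$ and $\varepsilon \in (0, \varepsilon_0]$,

$$\bar{p}_{ii}(\varepsilon) = 1 - p_{ii}(\varepsilon) = \sum_{j \in Y_{ii}^+} p_{ij}(\varepsilon). \tag{101}$$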


1.1. To construct the $(\bar{h}_i', \bar{k}_i')$-expansion for the non-absorption probability $\bar{p}_{ii}(\varepsilon) = 1 - p_{ii}(\varepsilon)$ by applying propositions (i) (the multiplication by a constant rule) and (ii) (the summation rule) of Lemma 10.3 to the $(l_{ii}^-, l_{ii}^+)$-expansion for transition probability $p_{ii}(\varepsilon)$ given in condition B (first, this expansion is multiplied by constant $-1$ and, second, is summed with constant 1 represented as the $(0, l_{ii}^+)$-expansion given in relation (58)). In this case, parameters $\bar{h}_i' = 0$, $\bar{k}_i' = l_{ii}^+$.

1.2. To construct the $(\bar{h}_i'', \bar{k}_i'')$-expansion for the non-absorption probability $\bar{p}_{ii}(\varepsilon) = \sum_{j \in Y_{ii}^+} p_{ij}(\varepsilon)$ using the corresponding asymptotic expansions for transition probabilities $p_{ij}(\varepsilon)$, $j \in Y_{ii}^+$ given in condition B, and proposition (i) (the multiple summation rule) of Lemma 10.5. In this case, parameters $\bar{h}_i'' = \min_{j \in Y_{ii}^+} l_{ij}^-$, $\bar{k}_i'' = \min_{j \in Y_{ii}^+} l_{ij}^+$. This asymptotic expansion is pivotal.

1.3. To construct the $(\bar{h}_i, \bar{k}_i)$-expansion for the non-absorption probability $\bar{p}_{ii}(\varepsilon)$ using relation (101), and propositions (i)-(iv) of Lemma 10.1. In this case, parameters $\bar{h}_i = \bar{h}_i' \vee \bar{h}_i'' = \bar{h}_i''$, $\bar{k}_i = \bar{k}_i' \vee \bar{k}_i'' = l_{ii}^+ \vee \min_{j \in Y_{ii}^+} l_{ij}^+$. This asymptotic expansion is pivotal.

It should be noted that the $(l_{ii}^-, l_{ii}^+)$-expansion for the transition probability $p_{ii}(\varepsilon)$ given in condition B and the $(\bar{h}_i'', \bar{k}_i'')$-expansion for function $\bar{p}_{ii}(\varepsilon) = \sum_{j \in Y_{ii}^+} p_{ij}(\varepsilon)$ given in Step 1.2 satisfy, for every $i \in X$, the additional conditions given in Remark 10.1, respectively, for set $Z = \{i\}$ and set $Z = Y_{ii}^+$.

Case 2: $i \in \overline{Y}$.

1.4. In this case, the non-absorption probability $\bar{p}_{ii}(\varepsilon) \equiv 1$. If necessary, it can be represented in the form of the $(0, n)$-expansion given by relation (58), for any $n = 0, 1, \ldots$.

The above remarks can be summarized in the following lemma. 


Lemma 10.10 Let conditions A, B and C hold. Then, the asymptotic expansions for the non-absorption probabilities $\bar{p}_{ii}(\varepsilon)$, $i \in X$ are given in Algorithm 1.
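The exponent bookkeeping of Steps 1.1 to 1.3 in Case 1 can be sketched as follows. This is an illustration under stated assumptions: the function name `non_absorption_parameters` and the dict-based inputs are hypothetical, and the rule $h = h' \vee h''$, $k = k' \vee k''$ for combining two expansions of the same function is taken from the parameter formulas quoted in Step 1.3 (Lemma 10.1 itself is stated earlier in the chapter).

```python
def non_absorption_parameters(i, l_minus, l_plus):
    """Exponents of the expansions built in Steps 1.1-1.3 of Algorithm 1 (Case 1).

    l_minus[j] and l_plus[j] hold the exponents of the expansion of p_ij from
    condition B, for j in the transition set Y_i (which must contain i here).
    """
    if i not in l_minus:
        raise ValueError("Case 1 requires i to belong to Y_i")
    others = [j for j in l_minus if j != i]                 # the set Y_ii^+
    step_1_1 = (0, l_plus[i])                               # expansion of 1 - p_ii
    step_1_2 = (min(l_minus[j] for j in others),            # expansion of the sum over Y_ii^+
                min(l_plus[j] for j in others))
    step_1_3 = (max(step_1_1[0], step_1_2[0]),              # h' v h''
                max(step_1_1[1], step_1_2[1]))              # k' v k''
    return step_1_1, step_1_2, step_1_3


# Illustrative exponents for a state i = 1 with Y_1 = {1, 2, 3}.
l_minus = {1: 0, 2: 1, 3: 2}
l_plus = {1: 2, 2: 3, 3: 4}
params = non_absorption_parameters(1, l_minus, l_plus)
```

Here the expansion obtained from Step 1.2 starts at order 1 and reaches order 3, which is longer than the $(0, 2)$-expansion of Step 1.1; Step 1.3 keeps the larger exponents.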


Algorithm 2. This is an algorithm for computing upper bounds for remainders of asymptotic expansions for non-absorption probabilities $\bar{p}_{ii}(\varepsilon)$, $i \in X$.

Case 1: $i \in Y$.

2.1. To construct the $(\bar{h}_i', \bar{k}_i', \bar{\delta}_i', \bar{G}_i', \bar{\varepsilon}_i')$-expansion for the non-absorption probability $\bar{p}_{ii}(\varepsilon) = 1 - p_{ii}(\varepsilon)$ by applying propositions (i) (the multiplication by a constant rule) and (ii) (the summation rule) of Lemma 10.4 to the $(l_{ii}^-, l_{ii}^+, \delta_{ii}, G_{ii}, \varepsilon_{ii})$-expansion for the transition probability $p_{ii}(\varepsilon)$ given in condition B' (first, this expansion is multiplied by constant $-1$; second, it is summed with constant 1 represented as the $(0, l_{ii}^+, 1, G, \varepsilon_0)$-expansion given in relation (58); third, constant $G$ can be replaced by 0, since it can be taken arbitrarily small). In this case, parameters $\bar{\delta}_i' = \delta_{ii}$, $\bar{G}_i' = G_{ii}$, $\bar{\varepsilon}_i' = \varepsilon_{ii}$.

2.2. To construct the $(\bar{h}_i'', \bar{k}_i'', \bar{\delta}_i'', \bar{G}_i'', \bar{\varepsilon}_i'')$-expansion for the non-absorption probability $\bar{p}_{ii}(\varepsilon) = \sum_{j \in Y_{ii}^+} p_{ij}(\varepsilon)$ using the $(l_{ij}^-, l_{ij}^+, \delta_{ij}, G_{ij}, \varepsilon_{ij})$-expansions for transition probabilities $p_{ij}(\varepsilon)$, $j \in Y_{ii}^+$ given in condition B', and proposition (i) (the multiple summation rule) of Lemma 10.6. In this case, parameters $\bar{\delta}_i''$, $\bar{G}_i''$, $\bar{\varepsilon}_i''$ are given by the corresponding variant of relation (24).

2.3. To construct the $(\bar{h}_i, \bar{k}_i, \bar{\delta}_i, \bar{G}_i, \bar{\varepsilon}_i)$-expansion for the non-absorption probability $\bar{p}_{ii}(\varepsilon)$ using relation (101), and proposition (i) of Lemma 10.2. In this case, parameters $\bar{\delta}_i$, $\bar{G}_i$, $\bar{\varepsilon}_i$ are given by the corresponding variant of relation (4).

Case 2: $i \in \overline{Y}$.

2.4. In this case, the non-absorption probability $\bar{p}_{ii}(\varepsilon) \equiv 1$. If necessary, it can be represented in the form of the $(0, n, 1, G, \varepsilon_0)$-expansion given by relation (58), for any $0 < G < \infty$ and $n = 0, 1, \ldots$.


The above remarks can be summarized in the following lemma. 


Lemma 10.11 Let conditions A, B' and C hold. Then, the asymptotic expansions for the non-absorption probabilities $\bar{p}_{ii}(\varepsilon)$, $i \in X$ with explicit upper bounds for remainders are given in Algorithm 2.




4.4 Asymptotic Expansions for Transition Probabilities 
of Reduced Embedded Markov Chains 


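Relation (87) can be re-written in the following form, more convenient for constructing asymptotic expansions for probabilities ${}_rp_{ij}(\varepsilon)$, $i, j \in {}_rX$,

$${}_rp_{ij}(\varepsilon) = \begin{cases} p_{ij}(\varepsilon) + p_{ir}(\varepsilon) \frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)} & \text{if } j \in Y_{ir}^+ \cap Y_{ir}^-, \\ p_{ij}(\varepsilon) & \text{if } j \in Y_{ir}^+ \cap \overline{Y}_{ir}^-, \\ p_{ir}(\varepsilon) \frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)} & \text{if } j \in \overline{Y}_{ir}^+ \cap Y_{ir}^-, \\ 0 & \text{if } j \in \overline{Y}_{ir}^+ \cap \overline{Y}_{ir}^- = {}_r\overline{Y}_i. \end{cases} \tag{102}$$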

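Algorithm 3. This is an algorithm for constructing asymptotic expansions for transition probabilities ${}_rp_{ij}(\varepsilon)$, $i, j \in {}_rX$.

Case 1: $r \in Y$.

3.1. To construct $(\tilde{h}_{rj}, \tilde{k}_{rj})$-expansions for conditional probabilities $\tilde{p}_{rj}(\varepsilon) = \frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)}$, $j \in Y_{rr}^+$, using the $(l_{rj}^-, l_{rj}^+)$-expansions for transition probabilities $p_{rj}(\varepsilon)$ given in condition B, the $(\bar{h}_r, \bar{k}_r)$-expansion for the non-absorption probability $\bar{p}_{rr}(\varepsilon) = 1 - p_{rr}(\varepsilon)$ given in Algorithm 1, and proposition (v) (the division rule) of Lemma 10.3. In this case, parameters $\tilde{h}_{rj} = l_{rj}^- - \bar{h}_r$, $\tilde{k}_{rj} = (l_{rj}^+ - \bar{h}_r) \wedge (l_{rj}^- + \bar{k}_r - 2\bar{h}_r)$, $j \in Y_{rr}^+$. These asymptotic expansions are pivotal.

3.2. To construct $(\hat{h}_{irj}, \hat{k}_{irj})$-expansions for products $\hat{p}_{irj}(\varepsilon) = p_{ir}(\varepsilon)\tilde{p}_{rj}(\varepsilon) = p_{ir}(\varepsilon) \frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)}$, $j \in Y_{ir}^-$, $i \in {}_rX$, using the $(l_{ir}^-, l_{ir}^+)$-expansions for transition probabilities $p_{ir}(\varepsilon)$ given in condition B, the $(\tilde{h}_{rj}, \tilde{k}_{rj})$-expansions for conditional probabilities $\tilde{p}_{rj}(\varepsilon)$ given in the above Step 3.1 and proposition (iii) (the multiplication rule) of Lemma 10.3. In this case, parameters $\hat{h}_{irj} = l_{ir}^- + \tilde{h}_{rj}$, $\hat{k}_{irj} = (l_{ir}^- + \tilde{k}_{rj}) \wedge (l_{ir}^+ + \tilde{h}_{rj})$, $j \in Y_{ir}^-$, $i \in {}_rX$. These asymptotic expansions are pivotal.

3.3. To construct $(\check{h}_{irj}, \check{k}_{irj})$-expansions for sums $\check{p}_{irj}(\varepsilon) = p_{ij}(\varepsilon) + \hat{p}_{irj}(\varepsilon) = p_{ij}(\varepsilon) + p_{ir}(\varepsilon) \frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)}$, $j \in Y_{ir}^+ \cap Y_{ir}^-$, $i \in {}_rX$, using the $(l_{ij}^-, l_{ij}^+)$-expansions for transition probabilities $p_{ij}(\varepsilon)$ given in condition B, the $(\hat{h}_{irj}, \hat{k}_{irj})$-expansions for quantities $\hat{p}_{irj}(\varepsilon)$ given in the above Step 3.2 and proposition (ii) (the summation rule) of Lemma 10.3. In this case, parameters $\check{h}_{irj} = l_{ij}^- \wedge \hat{h}_{irj}$, $\check{k}_{irj} = l_{ij}^+ \wedge \hat{k}_{irj}$, $j \in Y_{ir}^+ \cap Y_{ir}^-$, $i \in {}_rX$. These asymptotic expansions are pivotal.

3.4. To construct $({}_rl_{ij}^-, {}_rl_{ij}^+)$-expansions for transition probabilities ${}_rp_{ij}(\varepsilon) = \sum_{l = {}_rl_{ij}^-}^{{}_rl_{ij}^+} {}_ra_{ij}[l]\varepsilon^l + {}_ro_{ij}(\varepsilon^{{}_rl_{ij}^+})$, $i, j \in {}_rX$, using the $(l_{ij}^-, l_{ij}^+)$-expansions for transition probabilities $p_{ij}(\varepsilon)$ given in condition B, the $(\hat{h}_{irj}, \hat{k}_{irj})$-expansions for quantities $\hat{p}_{irj}(\varepsilon)$ and the $(\check{h}_{irj}, \check{k}_{irj})$-expansions for quantities $\check{p}_{irj}(\varepsilon)$ given, respectively, in the above Steps 3.2 and 3.3, and the corresponding variants of formulas for transition probabilities ${}_rp_{ij}(\varepsilon)$ given in relation (102). In this case, parameters ${}_rl_{ij}^- = \check{h}_{irj}$, ${}_rl_{ij}^+ = \check{k}_{irj}$ if $j \in Y_{ir}^+ \cap Y_{ir}^-$, $i \in {}_rX$, or ${}_rl_{ij}^- = l_{ij}^-$, ${}_rl_{ij}^+ = l_{ij}^+$ if $j \in Y_{ir}^+ \cap \overline{Y}_{ir}^-$, $i \in {}_rX$, or ${}_rl_{ij}^- = \hat{h}_{irj}$, ${}_rl_{ij}^+ = \hat{k}_{irj}$ if $j \in \overline{Y}_{ir}^+ \cap Y_{ir}^-$, $i \in {}_rX$. These asymptotic expansions are pivotal.

Case 2: $r \in \overline{Y}$.

3.5. The corresponding algorithm is a particular case of the algorithm given in Steps 3.1-3.4. In this case, the non-absorption probability $\bar{p}_{rr}(\varepsilon) = 1 - p_{rr}(\varepsilon) \equiv 1$ and, thus, conditional probabilities $\tilde{p}_{rj}(\varepsilon) = \frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)} = p_{rj}(\varepsilon)$, $j \in Y_{rr}^+$. This permits one to replace the $(\tilde{h}_{rj}, \tilde{k}_{rj})$-expansions for conditional probabilities $\tilde{p}_{rj}(\varepsilon)$ by the $(l_{rj}^-, l_{rj}^+)$-expansions for transition probabilities $p_{rj}(\varepsilon)$. This is the only change required in the algorithm for construction of asymptotic expansions for transition probabilities ${}_rp_{ij}(\varepsilon)$, $i, j \in {}_rX$ given in Steps 3.1-3.4.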

The above remarks can be summarized in the following theorem. 


Theorem 10.2 Conditions A, B and C, assumed to hold for the Markov chain $\eta_n^{(\varepsilon)}$, also hold for the reduced Markov chain ${}_r\eta_n^{(\varepsilon)}$, for every $r \in X$. The asymptotic expansions appearing in condition B are given for transition probabilities ${}_rp_{ij}(\varepsilon)$, $j \in {}_rY_i$, $i \in {}_rX$, $r \in X$ in Algorithm 3.
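The chain of operations in Steps 3.1 to 3.3 (division, multiplication, summation) can be traced on a toy example. The sketch below is not the authors' code: the helpers `add`, `mul`, `div` and `neg` are hypothetical, an expansion is stored as a pair of the lowest exponent and the coefficient list with the remainder left implicit, and the division recursion is the standard term-by-term series division, assumed here to agree with proposition (v) of Lemma 10.3. The perturbed chain is invented for illustration: $p_{ij} = 1 - \varepsilon$, $p_{ir} = \varepsilon$, $p_{rj} = 1/4 + \varepsilon$, $p_{rr} = 1/2 - \varepsilon$.

```python
from fractions import Fraction as F


def add(a, b):
    """Summation rule: h = h_A ^ h_B, k = k_A ^ k_B."""
    (ha, ca), (hb, cb) = a, b
    h = min(ha, hb)
    k = min(ha + len(ca) - 1, hb + len(cb) - 1)
    at = lambda h0, c, n: c[n - h0] if n >= h0 else 0
    return h, [at(ha, ca, n) + at(hb, cb, n) for n in range(h, k + 1)]


def mul(a, b):
    """Multiplication rule: h = h_A + h_B, k = (h_A + k_B) ^ (k_A + h_B)."""
    (ha, ca), (hb, cb) = a, b
    w = min(len(ca), len(cb))                      # number of terms k - h + 1
    return ha + hb, [sum(ca[l1] * cb[l - l1] for l1 in range(l + 1)) for l in range(w)]


def div(a, b):
    """Division rule: h = h_A - h_B, k = (k_A - h_B) ^ (h_A + k_B - 2 h_B)."""
    (ha, ca), (hb, cb) = a, b
    w = min(len(ca), len(cb))                      # number of terms k - h + 1
    d = []
    for l in range(w):
        d.append((F(ca[l]) - sum(cb[m] * d[l - m] for m in range(1, l + 1))) / cb[0])
    return ha - hb, d


def neg(a):
    return a[0], [-c for c in a[1]]


one = (0, [F(1), F(0)])            # constant 1 as a (0, 1)-expansion, relation (58)
p_ij = (0, [F(1), F(-1)])          # 1 - eps + o(eps)
p_ir = (1, [F(1), F(0)])           # eps + 0*eps**2 + o(eps**2)
p_rj = (0, [F(1, 4), F(1)])        # 1/4 + eps + o(eps)
p_rr = (0, [F(1, 2), F(-1)])       # 1/2 - eps + o(eps)

non_abs = add(one, neg(p_rr))      # 1 - p_rr, as in Step 1.1 of Algorithm 1
p_tilde = div(p_rj, non_abs)       # Step 3.1
p_hat = mul(p_ir, p_tilde)         # Step 3.2
r_p_ij = add(p_ij, p_hat)          # Step 3.3, first line of relation (102)
```

The result is the $(0, 1)$-expansion ${}_rp_{ij}(\varepsilon) = 1 - \varepsilon/2 + o(\varepsilon)$, which agrees with a direct Taylor expansion of $1 - \varepsilon + \varepsilon(1/4 + \varepsilon)/(1/2 + \varepsilon)$.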


Algorithm 4. This is an algorithm for computing upper bounds for remainders in 
asymptotic expansions | for transition probabilities ,pj(e), i,j € +X. 
4.1. To construct (hy; a kys by, C j, €,;)-expansions for conditional probabilities 
Dr (E) 


Pile) = mare Jé€ Ye using the Ci Ds 5,;, G,j, €,;)-expansions for transition 


probabilities p,;(e) given in condition B’, the (h, ; Res b,j a Ga: €,;)-expansion for the 
non-absorption probability p,,(€) = 1 — p;,(€) given in Algorithm 2, and the propo- 
sition (v) (the division rule) of Lemma 10.4. In this case, parameters 4,;, G,j, Ej, J € 


Y+ are given by the corresponding variants of relation (17). 


4.2. To construct (hirj, Kinjs 8irjs Giri. Biri) expansions for products Dirj(€) = 
Pirl€)py(é) =pirle) -#@., je Y;z, ie -X, using the (IF, It, bir, Gir, eir)- 


1—py, -(€) ? ir? “ir? 

expansions for transition probabilities p;-(e) given in condition B’, the (hij, Kj 
5,;, G,j, €,j)-expansions for conditional probabilities p,;(¢) given in the above Step 
4.1 and the proposition (iii) (the multiplication rule) of Lemma 10.4. In this case, 
parameters 4j,;, Gin, 2 inj, J € Y;,, i € -X are given by the corresponding variants of 
relation (15). 

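4.3. To construct $(\hat h_{irj}, \hat k_{irj}, \hat\delta_{irj}, \hat G_{irj}, \hat\varepsilon_{irj})$-expansions for sums $\hat p_{irj}(\varepsilon) = p_{ij}(\varepsilon) + \tilde p_{irj}(\varepsilon) = p_{ij}(\varepsilon) + p_{ir}(\varepsilon) \cdot \frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)}$, $j \in Y^+_{ir} \cap Y^-_{ir}$, $i \in {}_rX$, using the $(l^-_{ij}, l^+_{ij}, \delta_{ij}, G_{ij}, \varepsilon_{ij})$-expansions for transition probabilities $p_{ij}(\varepsilon)$ given in condition B', the $(\tilde h_{irj}, \tilde k_{irj}, \tilde\delta_{irj}, \tilde G_{irj}, \tilde\varepsilon_{irj})$-expansions for quantities $\tilde p_{irj}(\varepsilon)$ given in the above Step 4.2 and the proposition (ii) (the summation rule) of Lemma 10.4. In this case, parameters $\hat\delta_{irj}, \hat G_{irj}, \hat\varepsilon_{irj}$, $j \in Y^+_{ir} \cap Y^-_{ir}$, $i \in {}_rX$, are given by the corresponding variants of relation (14).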

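4.4. To construct $({}_rl^-_{ij}, {}_rl^+_{ij}, {}_r\delta_{ij}, {}_rG_{ij}, {}_r\varepsilon_{ij})$-expansions for transition probabilities ${}_rp_{ij}(\varepsilon) = \sum_{l = {}_rl^-_{ij}}^{{}_rl^+_{ij}} {}_ra_{ij}[l]\varepsilon^l + {}_ro_{ij}(\varepsilon^{{}_rl^+_{ij}})$, $i, j \in {}_rX$, using the $(l^-_{ij}, l^+_{ij}, \delta_{ij}, G_{ij}, \varepsilon_{ij})$-expansions for transition probabilities $p_{ij}(\varepsilon)$ given in condition B', the $(\tilde h_{irj}, \tilde k_{irj}, \tilde\delta_{irj}, \tilde G_{irj}, \tilde\varepsilon_{irj})$-expansions for quantities $\tilde p_{irj}(\varepsilon)$ and the $(\hat h_{irj}, \hat k_{irj}, \hat\delta_{irj}, \hat G_{irj}, \hat\varepsilon_{irj})$-expansions for quantities $\hat p_{irj}(\varepsilon)$ given, respectively, in the above Steps 4.2 and 4.3, and the corresponding variants of formulas for transition probabilities ${}_rp_{ij}(\varepsilon)$ given in relation (102). In this case, parameters ${}_r\delta_{ij} = \hat\delta_{irj}$, ${}_rG_{ij} = \hat G_{irj}$, ${}_r\varepsilon_{ij} = \hat\varepsilon_{irj}$ if $j \in Y^+_{ir} \cap Y^-_{ir}$, $i \in {}_rX$, or ${}_r\delta_{ij} = \delta_{ij}$, ${}_rG_{ij} = G_{ij}$, ${}_r\varepsilon_{ij} = \varepsilon_{ij}$ if $j \in Y^+_{ir} \cap \overline{Y}^-_{ir}$, $i \in {}_rX$, or ${}_r\delta_{ij} = \tilde\delta_{irj}$, ${}_rG_{ij} = \tilde G_{irj}$, ${}_r\varepsilon_{ij} = \tilde\varepsilon_{irj}$ if $j \in \overline{Y}^+_{ir} \cap Y^-_{ir}$, $i \in {}_rX$.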

Case 2: $r \notin Y_r$.

4.5. The corresponding algorithm is a particular case of the algorithm given in Steps 4.1-4.4. In this case the non-absorption probability $\bar p_{rr}(\varepsilon) = 1 - p_{rr}(\varepsilon) = 1$ and, thus, conditional probabilities $\tilde p_{rj}(\varepsilon) = \frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)} = p_{rj}(\varepsilon)$, $j \in Y^+_{rr}$. This permits one to replace the $(\tilde h_{rj}, \tilde k_{rj}, \tilde\delta_{rj}, \tilde G_{rj}, \tilde\varepsilon_{rj})$-expansions for conditional probabilities $\tilde p_{rj}(\varepsilon)$ by the $(l^-_{rj}, l^+_{rj}, \delta_{rj}, G_{rj}, \varepsilon_{rj})$-expansions for transition probabilities $p_{rj}(\varepsilon)$. This is the only change required in the algorithm for construction of asymptotic expansions for transition probabilities ${}_rp_{ij}(\varepsilon)$, $i, j \in {}_rX$, given in Steps 4.1-4.4.

The above remarks can be summarized in the following theorem.


Theorem 10.3 Conditions A, B' and C, assumed to hold for the Markov chain $\eta_n^{(\varepsilon)}$, also hold for the reduced Markov chain ${}_r\eta_n^{(\varepsilon)}$, for every $r \in X$. The upper bounds for remainders in asymptotic expansions appearing in condition B' are given for transition probabilities ${}_rp_{ij}(\varepsilon)$, $j \in {}_rY_i$, $i \in {}_rX$, $r \in X$, in Algorithm 4.


4.5 Asymptotic Expansions for Expectations of Sojourn 
Times for Reduced Semi-Markov Processes 


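Relation (96) can be re-written in the following form, more convenient for constructing the corresponding asymptotic expansions for expectations ${}_re_{ij}(\varepsilon)$, $i, j \in {}_rX$,

$$
{}_re_{ij}(\varepsilon) =
\begin{cases}
e_{ij}(\varepsilon) + e_{ir}(\varepsilon)\frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)} + e_{rr}(\varepsilon)\frac{p_{ir}(\varepsilon)}{1 - p_{rr}(\varepsilon)}\frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)} + e_{rj}(\varepsilon)\frac{p_{ir}(\varepsilon)}{1 - p_{rr}(\varepsilon)} & \text{if } j \in Y^+_{ir} \cap Y^-_{ir}, \\
e_{ij}(\varepsilon) & \text{if } j \in Y^+_{ir} \cap \overline{Y}^-_{ir}, \\
e_{ir}(\varepsilon)\frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)} + e_{rr}(\varepsilon)\frac{p_{ir}(\varepsilon)}{1 - p_{rr}(\varepsilon)}\frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)} + e_{rj}(\varepsilon)\frac{p_{ir}(\varepsilon)}{1 - p_{rr}(\varepsilon)} & \text{if } j \in \overline{Y}^+_{ir} \cap Y^-_{ir}, \\
0 & \text{if } j \in \overline{Y}^+_{ir} \cap \overline{Y}^-_{ir}.
\end{cases}
\tag{103}
$$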
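The formula for the reduced expectations of transition times can be checked on a small numerical example. The sketch below is illustrative only: the function name `reduce_process`, the list-of-rows representation, and the three-state example with unit sojourn times (so that $e_{ij} = p_{ij}$) are assumptions made for the illustration.

```python
from fractions import Fraction

def reduce_process(P, E, r):
    """Return (P', E') for the process with state r excluded.

    P[i][j] are transition probabilities of the embedded chain and
    E[i][j] the expectations e_ij of transition times; requires p_rr < 1.
    """
    keep = [s for s in range(len(P)) if s != r]
    stay = Fraction(1) - P[r][r]  # non-absorption probability 1 - p_rr
    P_red = [[P[i][j] + P[i][r] * P[r][j] / stay for j in keep] for i in keep]
    E_red = [[E[i][j]
              + E[i][r] * P[r][j] / stay                  # time of the jump i -> r
              + E[r][r] * P[i][r] * P[r][j] / stay ** 2   # time spent in loops r -> r
              + E[r][j] * P[i][r] / stay                  # time of the jump r -> j
              for j in keep] for i in keep]
    return P_red, E_red

# Hypothetical example: unit sojourn times, hence e_ij = p_ij.
F = Fraction
P = [[F(0), F(1, 2), F(1, 2)],
     [F(1, 4), F(1, 4), F(1, 2)],
     [F(1, 3), F(1, 3), F(1, 3)]]
P_red, E_red = reduce_process(P, P, 2)
```

In this example each row of `E_red` sums to 7/4: one unit of time for the first jump plus, with probability 1/2, an expected 3/2 units spent in the excluded state.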


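Algorithm 5. This is an algorithm for computing asymptotic expansions for expectations ${}_re_{ij}(\varepsilon)$, $i, j \in {}_rX$.

Case 1: $r \in Y_r$.

5.1. To construct $(\tilde h_{ir}, \tilde k_{ir})$-expansions for quantities $\tilde p_{ir}(\varepsilon) = \frac{p_{ir}(\varepsilon)}{1 - p_{rr}(\varepsilon)}$, $i \in {}_rX$, using the $(l^-_{ir}, l^+_{ir})$-expansions for transition probabilities $p_{ir}(\varepsilon)$ given in condition B, the $(\bar h_{rr}, \bar k_{rr})$-expansion for the non-absorption probability $\bar p_{rr}(\varepsilon) = 1 - p_{rr}(\varepsilon)$ given in Algorithm 1, and proposition (v) (the division rule) of Lemma 10.3. In this case, parameters $\tilde h_{ir} = l^-_{ir} - \bar h_{rr}$, $\tilde k_{ir} = (l^-_{ir} + \bar k_{rr} - 2\bar h_{rr}) \wedge (l^+_{ir} - \bar h_{rr})$, $i \in {}_rX$. These asymptotic expansions are pivotal.

5.2. To construct $(\check h_{irj}, \check k_{irj})$-expansions for products $\check p_{irj}(\varepsilon) = \tilde p_{ir}(\varepsilon)\tilde p_{rj}(\varepsilon) = \frac{p_{ir}(\varepsilon)}{1 - p_{rr}(\varepsilon)} \cdot \frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)}$, $j \in Y^-_{ir}$, $i \in {}_rX$, using the $(\tilde h_{ir}, \tilde k_{ir})$-expansions given in the above Step 5.1, the $(\tilde h_{rj}, \tilde k_{rj})$-expansions for conditional probabilities $\tilde p_{rj}(\varepsilon) = \frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)}$ given in Step 3.1 of Algorithm 3, and the proposition (iii) (the multiplication rule) of Lemma 10.3. In this case, parameters $\check h_{irj} = \tilde h_{ir} + \tilde h_{rj}$, $\check k_{irj} = (\tilde h_{ir} + \tilde k_{rj}) \wedge (\tilde k_{ir} + \tilde h_{rj})$, $j \in Y^-_{ir}$, $i \in {}_rX$. These asymptotic expansions are pivotal.

5.3. To construct $(\dot h_{irj}, \dot k_{irj})$-expansions for products $\dot e_{irj}(\varepsilon) = e_{ir}(\varepsilon)\tilde p_{rj}(\varepsilon) = e_{ir}(\varepsilon)\frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)}$, $j \in Y^-_{ir}$, $i \in {}_rX$, using the $(m^-_{ir}, m^+_{ir})$-expansions for expectations $e_{ir}(\varepsilon)$ given in condition F, the $(\tilde h_{rj}, \tilde k_{rj})$-expansions for conditional probabilities $\tilde p_{rj}(\varepsilon) = \frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)}$ given in Step 3.1 of Algorithm 3, and the proposition (iii) (the multiplication rule) of Lemma 10.3. In this case, parameters $\dot h_{irj} = m^-_{ir} + \tilde h_{rj}$, $\dot k_{irj} = (m^-_{ir} + \tilde k_{rj}) \wedge (m^+_{ir} + \tilde h_{rj})$, $j \in Y^-_{ir}$, $i \in {}_rX$. These asymptotic expansions are pivotal.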


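5.4. To construct $(\ddot h_{irj}, \ddot k_{irj})$-expansions for products $\ddot e_{irj}(\varepsilon) = e_{rr}(\varepsilon) \cdot \check p_{irj}(\varepsilon) = e_{rr}(\varepsilon)\frac{p_{ir}(\varepsilon)}{1 - p_{rr}(\varepsilon)}\frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)}$, $j \in Y^-_{ir}$, $i \in {}_rX$, using the $(m^-_{rr}, m^+_{rr})$-expansions for expectations $e_{rr}(\varepsilon)$ given in condition F, the $(\check h_{irj}, \check k_{irj})$-expansions for quantities $\check p_{irj}(\varepsilon)$ given in the above Step 5.2, and the proposition (iii) (the multiplication rule) of Lemma 10.3. In this case, parameters $\ddot h_{irj} = m^-_{rr} + \check h_{irj}$, $\ddot k_{irj} = (m^-_{rr} + \check k_{irj}) \wedge (m^+_{rr} + \check h_{irj})$, $j \in Y^-_{ir}$, $i \in {}_rX$. These asymptotic expansions are pivotal.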


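5.5. To construct $(\dddot h_{irj}, \dddot k_{irj})$-expansions for products $\dddot e_{irj}(\varepsilon) = e_{rj}(\varepsilon) \cdot \tilde p_{ir}(\varepsilon) = e_{rj}(\varepsilon)\frac{p_{ir}(\varepsilon)}{1 - p_{rr}(\varepsilon)}$, $j \in Y^-_{ir}$, $i \in {}_rX$, using the $(m^-_{rj}, m^+_{rj})$-expansions for expectations $e_{rj}(\varepsilon)$ given in condition F, the $(\tilde h_{ir}, \tilde k_{ir})$-expansions for quantities $\tilde p_{ir}(\varepsilon)$ given in the above Step 5.1, and the proposition (iii) (the multiplication rule) of Lemma 10.3. In this case, parameters $\dddot h_{irj} = m^-_{rj} + \tilde h_{ir}$, $\dddot k_{irj} = (m^-_{rj} + \tilde k_{ir}) \wedge (m^+_{rj} + \tilde h_{ir})$, $j \in Y^-_{ir}$, $i \in {}_rX$. These asymptotic expansions are pivotal.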


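5.6. To construct $(\breve h_{irj}, \breve k_{irj})$-expansions for sums $\breve e_{irj}(\varepsilon) = \dot e_{irj}(\varepsilon) + \ddot e_{irj}(\varepsilon) + \dddot e_{irj}(\varepsilon) = e_{ir}(\varepsilon)\frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)} + e_{rr}(\varepsilon)\frac{p_{ir}(\varepsilon)}{1 - p_{rr}(\varepsilon)}\frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)} + e_{rj}(\varepsilon)\frac{p_{ir}(\varepsilon)}{1 - p_{rr}(\varepsilon)}$, using the $(\dot h_{irj}, \dot k_{irj})$-expansions for quantities $\dot e_{irj}(\varepsilon)$, the $(\ddot h_{irj}, \ddot k_{irj})$-expansions for quantities $\ddot e_{irj}(\varepsilon)$ and the $(\dddot h_{irj}, \dddot k_{irj})$-expansions for quantities $\dddot e_{irj}(\varepsilon)$ given, respectively, in the above Steps 5.3, 5.4 and 5.5, and the proposition (i) (the summation rule) of Lemma 10.5. In this case, parameters $\breve h_{irj} = \dot h_{irj} \wedge \ddot h_{irj} \wedge \dddot h_{irj}$, $\breve k_{irj} = \dot k_{irj} \wedge \ddot k_{irj} \wedge \dddot k_{irj}$, $j \in Y^-_{ir}$, $i \in {}_rX$. These asymptotic expansions are pivotal.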


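5.7. To construct $(\acute h_{irj}, \acute k_{irj})$-expansions for sums $\acute e_{irj}(\varepsilon) = e_{ij}(\varepsilon) + \breve e_{irj}(\varepsilon) = e_{ij}(\varepsilon) + e_{ir}(\varepsilon)\frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)} + e_{rr}(\varepsilon)\frac{p_{ir}(\varepsilon)}{1 - p_{rr}(\varepsilon)}\frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)} + e_{rj}(\varepsilon)\frac{p_{ir}(\varepsilon)}{1 - p_{rr}(\varepsilon)}$, using the $(m^-_{ij}, m^+_{ij})$-expansions for expectations $e_{ij}(\varepsilon)$ given in condition F, the $(\breve h_{irj}, \breve k_{irj})$-expansions for quantities $\breve e_{irj}(\varepsilon)$ given in the above Step 5.6, and the proposition (ii) (the summation rule) of Lemma 10.3. In this case, parameters $\acute h_{irj} = m^-_{ij} \wedge \breve h_{irj}$, $\acute k_{irj} = m^+_{ij} \wedge \breve k_{irj}$, $j \in Y^+_{ir} \cap Y^-_{ir}$, $i \in {}_rX$. These asymptotic expansions are pivotal.

5.8. To construct $({}_rm^-_{ij}, {}_rm^+_{ij})$-expansions for expectations ${}_re_{ij}(\varepsilon) = \sum_{l = {}_rm^-_{ij}}^{{}_rm^+_{ij}} {}_rb_{ij}[l]\varepsilon^l + {}_r\dot o_{ij}(\varepsilon^{{}_rm^+_{ij}})$, $i, j \in {}_rX$, using the asymptotic expansions for expectations $e_{ij}(\varepsilon)$ given in condition F, the $(\breve h_{irj}, \breve k_{irj})$-expansions for quantities $\breve e_{irj}(\varepsilon)$ and the $(\acute h_{irj}, \acute k_{irj})$-expansions for quantities $\acute e_{irj}(\varepsilon)$ given, respectively, in Steps 5.6 and 5.7, and the corresponding variants of formulas for expectations ${}_re_{ij}(\varepsilon)$ given in relation (103). In this case, parameters ${}_rm^-_{ij} = \acute h_{irj}$, ${}_rm^+_{ij} = \acute k_{irj}$ if $j \in Y^+_{ir} \cap Y^-_{ir}$, $i \in {}_rX$, or ${}_rm^-_{ij} = m^-_{ij}$, ${}_rm^+_{ij} = m^+_{ij}$ if $j \in Y^+_{ir} \cap \overline{Y}^-_{ir}$, $i \in {}_rX$, or ${}_rm^-_{ij} = \breve h_{irj}$, ${}_rm^+_{ij} = \breve k_{irj}$ if $j \in \overline{Y}^+_{ir} \cap Y^-_{ir}$, $i \in {}_rX$. These asymptotic expansions are pivotal.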

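Case 2: $r \notin Y_r$.

5.9. The corresponding algorithm is a particular case of the algorithm given in Steps 5.1-5.8. In this case the non-absorption probability $\bar p_{rr}(\varepsilon) = 1 - p_{rr}(\varepsilon) = 1$ and, thus, conditional probabilities $\tilde p_{rj}(\varepsilon) = \frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)} = p_{rj}(\varepsilon)$, $j \in Y^+_{rr}$, and quantities $\tilde p_{ir}(\varepsilon) = \frac{p_{ir}(\varepsilon)}{1 - p_{rr}(\varepsilon)} = p_{ir}(\varepsilon)$, $i \in {}_rX$. This permits one to replace the $(\tilde h_{rj}, \tilde k_{rj})$-expansions for conditional probabilities $\tilde p_{rj}(\varepsilon)$ by the $(l^-_{rj}, l^+_{rj})$-expansions for transition probabilities $p_{rj}(\varepsilon)$, and the $(\tilde h_{ir}, \tilde k_{ir})$-expansions for quantities $\tilde p_{ir}(\varepsilon)$ by the $(l^-_{ir}, l^+_{ir})$-expansions for transition probabilities $p_{ir}(\varepsilon)$. These are the only changes required in the algorithm for construction of asymptotic expansions for expectations ${}_re_{ij}(\varepsilon)$, $i, j \in {}_rX$, given in Steps 5.1-5.8.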

The above remarks can be summarized in the following theorem. 


Theorem 10.4 Conditions A-F, assumed to hold for the semi-Markov process $\eta^{(\varepsilon)}(t)$, also hold for the reduced semi-Markov process ${}_r\eta^{(\varepsilon)}(t)$, for every $r \in X$. The asymptotic expansions appearing in conditions B and F are given for transition probabilities ${}_rp_{ij}(\varepsilon)$, $j \in {}_rY_i$, $i \in {}_rX$, $r \in X$, and expectations ${}_re_{ij}(\varepsilon)$, $j \in {}_rY_i$, $i \in {}_rX$, $r \in X$, in Algorithms 3 and 5.
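The parameter bookkeeping used throughout Algorithm 5 reduces to three rules for the pair $(h, k)$ of a Laurent expansion: the multiplication rule (Step 5.2), the summation rule (Step 5.6) and the division rule (Step 5.1). A minimal sketch of these rules follows; the function names and the numerical parameter values are illustrative assumptions.

```python
def multiply_rule(a, b):
    """(h, k) parameters of a product of two Laurent expansions."""
    (ha, ka), (hb, kb) = a, b
    return ha + hb, min(ha + kb, ka + hb)

def sum_rule(*params):
    """(h, k) parameters of a sum of several Laurent expansions."""
    return min(h for h, _ in params), min(k for _, k in params)

def divide_rule(a, b):
    """(h, k) parameters of the quotient a / b of two Laurent expansions."""
    (ha, ka), (hb, kb) = a, b
    return ha - hb, min(ka - hb, ha + kb - 2 * hb)

# Hypothetical parameters: p_ir has a (1, 3)-expansion and the
# non-absorption probability 1 - p_rr has a (0, 2)-expansion.
quotient = divide_rule((1, 3), (0, 2))      # parameters for p_ir / (1 - p_rr)
product = multiply_rule(quotient, (0, 2))   # multiplied by a (0, 2)-expansion
total = sum_rule((1, 3), (0, 2), (2, 2))    # sum of three expansions
```

The upper parameter $k$ never exceeds what the least accurate operand allows, which is why the minimum appears in every rule.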


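Algorithm 6. This is an algorithm for computing upper bounds for remainders in asymptotic expansions for expectations ${}_re_{ij}(\varepsilon)$, $i, j \in {}_rX$.

6.1. To construct $(\tilde h_{ir}, \tilde k_{ir}, \tilde\delta_{ir}, \tilde G_{ir}, \tilde\varepsilon_{ir})$-expansions for quantities $\tilde p_{ir}(\varepsilon) = \frac{p_{ir}(\varepsilon)}{1 - p_{rr}(\varepsilon)}$, $i \in {}_rX$, using the $(l^-_{ir}, l^+_{ir}, \delta_{ir}, G_{ir}, \varepsilon_{ir})$-expansions for transition probabilities $p_{ir}(\varepsilon)$ given in condition B', the $(\bar h_{rr}, \bar k_{rr}, \bar\delta_{rr}, \bar G_{rr}, \bar\varepsilon_{rr})$-expansion for the non-absorption probability $\bar p_{rr}(\varepsilon) = 1 - p_{rr}(\varepsilon)$ given in Algorithm 2, and proposition (v) (the division rule) of Lemma 10.4. In this case, parameters $\tilde\delta_{ir}, \tilde G_{ir}, \tilde\varepsilon_{ir}$, $i \in {}_rX$, are given by the corresponding variants of relation (17).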

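6.2. To construct $(\check h_{irj}, \check k_{irj}, \check\delta_{irj}, \check G_{irj}, \check\varepsilon_{irj})$-expansions for products $\check p_{irj}(\varepsilon) = \tilde p_{ir}(\varepsilon)\tilde p_{rj}(\varepsilon) = \frac{p_{ir}(\varepsilon)}{1 - p_{rr}(\varepsilon)} \cdot \frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)}$, $j \in Y^-_{ir}$, $i \in {}_rX$, using the $(\tilde h_{ir}, \tilde k_{ir}, \tilde\delta_{ir}, \tilde G_{ir}, \tilde\varepsilon_{ir})$-expansions given in the above Step 6.1, the $(\tilde h_{rj}, \tilde k_{rj}, \tilde\delta_{rj}, \tilde G_{rj}, \tilde\varepsilon_{rj})$-expansions for conditional probabilities $\tilde p_{rj}(\varepsilon) = \frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)}$ given in Step 4.1 of Algorithm 4, and the proposition (iii) (the multiplication rule) of Lemma 10.4. In this case, parameters $\check\delta_{irj}, \check G_{irj}, \check\varepsilon_{irj}$, $j \in Y^-_{ir}$, $i \in {}_rX$, are given by the corresponding variants of relation (15).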


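6.3. To construct $(\dot h_{irj}, \dot k_{irj}, \dot\delta_{irj}, \dot G_{irj}, \dot\varepsilon_{irj})$-expansions for products $\dot e_{irj}(\varepsilon) = e_{ir}(\varepsilon)\tilde p_{rj}(\varepsilon) = e_{ir}(\varepsilon)\frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)}$, $j \in Y^-_{ir}$, $i \in {}_rX$, using the $(m^-_{ir}, m^+_{ir}, \dot\delta_{ir}, \dot G_{ir}, \dot\varepsilon_{ir})$-expansions for expectations $e_{ir}(\varepsilon)$ given in condition F', the $(\tilde h_{rj}, \tilde k_{rj}, \tilde\delta_{rj}, \tilde G_{rj}, \tilde\varepsilon_{rj})$-expansions for conditional probabilities $\tilde p_{rj}(\varepsilon) = \frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)}$ given in Step 4.1 of Algorithm 4, and the proposition (iii) (the multiplication rule) of Lemma 10.4. In this case, parameters $\dot\delta_{irj}, \dot G_{irj}, \dot\varepsilon_{irj}$, $j \in Y^-_{ir}$, $i \in {}_rX$, are given by the corresponding variants of relation (15).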


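6.4. To construct $(\ddot h_{irj}, \ddot k_{irj}, \ddot\delta_{irj}, \ddot G_{irj}, \ddot\varepsilon_{irj})$-expansions for products $\ddot e_{irj}(\varepsilon) = e_{rr}(\varepsilon) \cdot \check p_{irj}(\varepsilon) = e_{rr}(\varepsilon)\frac{p_{ir}(\varepsilon)}{1 - p_{rr}(\varepsilon)}\frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)}$, $j \in Y^-_{ir}$, $i \in {}_rX$, using the $(m^-_{rr}, m^+_{rr}, \dot\delta_{rr}, \dot G_{rr}, \dot\varepsilon_{rr})$-expansions for expectations $e_{rr}(\varepsilon)$ given in condition F', the $(\check h_{irj}, \check k_{irj}, \check\delta_{irj}, \check G_{irj}, \check\varepsilon_{irj})$-expansions for quantities $\check p_{irj}(\varepsilon)$ given in the above Step 6.2, and the proposition (iii) (the multiplication rule) of Lemma 10.4. In this case, parameters $\ddot\delta_{irj}, \ddot G_{irj}, \ddot\varepsilon_{irj}$, $j \in Y^-_{ir}$, $i \in {}_rX$, are given by the corresponding variants of relation (15).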

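6.5. To construct $(\dddot h_{irj}, \dddot k_{irj}, \dddot\delta_{irj}, \dddot G_{irj}, \dddot\varepsilon_{irj})$-expansions for products $\dddot e_{irj}(\varepsilon) = e_{rj}(\varepsilon) \cdot \tilde p_{ir}(\varepsilon) = e_{rj}(\varepsilon)\frac{p_{ir}(\varepsilon)}{1 - p_{rr}(\varepsilon)}$, $j \in Y^-_{ir}$, $i \in {}_rX$, using the $(m^-_{rj}, m^+_{rj}, \dot\delta_{rj}, \dot G_{rj}, \dot\varepsilon_{rj})$-expansions for expectations $e_{rj}(\varepsilon)$ given in condition F', the $(\tilde h_{ir}, \tilde k_{ir}, \tilde\delta_{ir}, \tilde G_{ir}, \tilde\varepsilon_{ir})$-expansions for quantities $\tilde p_{ir}(\varepsilon)$ given in the above Step 6.1, and the proposition (iii) (the multiplication rule) of Lemma 10.4. In this case, parameters $\dddot\delta_{irj}, \dddot G_{irj}, \dddot\varepsilon_{irj}$, $j \in Y^-_{ir}$, $i \in {}_rX$, are given by the corresponding variants of relation (15).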


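6.6. To construct $(\breve h_{irj}, \breve k_{irj}, \breve\delta_{irj}, \breve G_{irj}, \breve\varepsilon_{irj})$-expansions for sums $\breve e_{irj}(\varepsilon) = \dot e_{irj}(\varepsilon) + \ddot e_{irj}(\varepsilon) + \dddot e_{irj}(\varepsilon) = e_{ir}(\varepsilon)\frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)} + e_{rr}(\varepsilon)\frac{p_{ir}(\varepsilon)}{1 - p_{rr}(\varepsilon)}\frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)} + e_{rj}(\varepsilon)\frac{p_{ir}(\varepsilon)}{1 - p_{rr}(\varepsilon)}$, using the $(\dot h_{irj}, \dot k_{irj}, \dot\delta_{irj}, \dot G_{irj}, \dot\varepsilon_{irj})$-expansions for quantities $\dot e_{irj}(\varepsilon)$, the $(\ddot h_{irj}, \ddot k_{irj}, \ddot\delta_{irj}, \ddot G_{irj}, \ddot\varepsilon_{irj})$-expansions for quantities $\ddot e_{irj}(\varepsilon)$ and the $(\dddot h_{irj}, \dddot k_{irj}, \dddot\delta_{irj}, \dddot G_{irj}, \dddot\varepsilon_{irj})$-expansions for quantities $\dddot e_{irj}(\varepsilon)$ given, respectively, in the above Steps 6.3, 6.4 and 6.5, and the proposition (i) (the summation rule) of Lemma 10.6. In this case, parameters $\breve\delta_{irj}, \breve G_{irj}, \breve\varepsilon_{irj}$, $j \in Y^-_{ir}$, $i \in {}_rX$, are given by the corresponding variants of relation (24).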

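6.7. To construct $(\acute h_{irj}, \acute k_{irj}, \acute\delta_{irj}, \acute G_{irj}, \acute\varepsilon_{irj})$-expansions for sums $\acute e_{irj}(\varepsilon) = e_{ij}(\varepsilon) + \breve e_{irj}(\varepsilon) = e_{ij}(\varepsilon) + e_{ir}(\varepsilon)\frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)} + e_{rr}(\varepsilon)\frac{p_{ir}(\varepsilon)}{1 - p_{rr}(\varepsilon)}\frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)} + e_{rj}(\varepsilon)\frac{p_{ir}(\varepsilon)}{1 - p_{rr}(\varepsilon)}$, using the $(m^-_{ij}, m^+_{ij}, \dot\delta_{ij}, \dot G_{ij}, \dot\varepsilon_{ij})$-expansions for expectations $e_{ij}(\varepsilon)$ given in condition F', the $(\breve h_{irj}, \breve k_{irj}, \breve\delta_{irj}, \breve G_{irj}, \breve\varepsilon_{irj})$-expansions for quantities $\breve e_{irj}(\varepsilon)$ given in the above Step 6.6, and the proposition (ii) (the summation rule) of Lemma 10.4. In this case, parameters $\acute\delta_{irj}, \acute G_{irj}, \acute\varepsilon_{irj}$, $j \in Y^+_{ir} \cap Y^-_{ir}$, $i \in {}_rX$, are given by the corresponding variants of relation (14).

6.8. To construct $({}_rm^-_{ij}, {}_rm^+_{ij}, {}_r\dot\delta_{ij}, {}_r\dot G_{ij}, {}_r\dot\varepsilon_{ij})$-expansions for expectations ${}_re_{ij}(\varepsilon) = \sum_{l = {}_rm^-_{ij}}^{{}_rm^+_{ij}} {}_rb_{ij}[l]\varepsilon^l + {}_r\dot o_{ij}(\varepsilon^{{}_rm^+_{ij}})$, $i, j \in {}_rX$, using the $(m^-_{ij}, m^+_{ij}, \dot\delta_{ij}, \dot G_{ij}, \dot\varepsilon_{ij})$-expansions for expectations $e_{ij}(\varepsilon)$ given in condition F', the $(\breve h_{irj}, \breve k_{irj}, \breve\delta_{irj}, \breve G_{irj}, \breve\varepsilon_{irj})$-expansions for quantities $\breve e_{irj}(\varepsilon)$ and the $(\acute h_{irj}, \acute k_{irj}, \acute\delta_{irj}, \acute G_{irj}, \acute\varepsilon_{irj})$-expansions for quantities $\acute e_{irj}(\varepsilon)$ given, respectively, in Steps 6.6 and 6.7, and the corresponding variants of formulas for expectations ${}_re_{ij}(\varepsilon)$ given in relation (103). In this case, parameters ${}_r\dot\delta_{ij} = \acute\delta_{irj}$, ${}_r\dot G_{ij} = \acute G_{irj}$, ${}_r\dot\varepsilon_{ij} = \acute\varepsilon_{irj}$ if $j \in Y^+_{ir} \cap Y^-_{ir}$, $i \in {}_rX$, or ${}_r\dot\delta_{ij} = \dot\delta_{ij}$, ${}_r\dot G_{ij} = \dot G_{ij}$, ${}_r\dot\varepsilon_{ij} = \dot\varepsilon_{ij}$ if $j \in Y^+_{ir} \cap \overline{Y}^-_{ir}$, $i \in {}_rX$, or ${}_r\dot\delta_{ij} = \breve\delta_{irj}$, ${}_r\dot G_{ij} = \breve G_{irj}$, ${}_r\dot\varepsilon_{ij} = \breve\varepsilon_{irj}$ if $j \in \overline{Y}^+_{ir} \cap Y^-_{ir}$, $i \in {}_rX$.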

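Case 2: $r \notin Y_r$.

6.9. The corresponding algorithm is a particular case of the algorithm given in Steps 6.1-6.8. In this case the non-absorption probability $\bar p_{rr}(\varepsilon) = 1 - p_{rr}(\varepsilon) = 1$ and, thus, conditional probabilities $\tilde p_{rj}(\varepsilon) = \frac{p_{rj}(\varepsilon)}{1 - p_{rr}(\varepsilon)} = p_{rj}(\varepsilon)$, $j \in Y^+_{rr}$, and quantities $\tilde p_{ir}(\varepsilon) = \frac{p_{ir}(\varepsilon)}{1 - p_{rr}(\varepsilon)} = p_{ir}(\varepsilon)$, $i \in {}_rX$. This permits one to replace the $(\tilde h_{rj}, \tilde k_{rj}, \tilde\delta_{rj}, \tilde G_{rj}, \tilde\varepsilon_{rj})$-expansions for conditional probabilities $\tilde p_{rj}(\varepsilon)$ by the $(l^-_{rj}, l^+_{rj}, \delta_{rj}, G_{rj}, \varepsilon_{rj})$-expansions for transition probabilities $p_{rj}(\varepsilon)$, and the $(\tilde h_{ir}, \tilde k_{ir}, \tilde\delta_{ir}, \tilde G_{ir}, \tilde\varepsilon_{ir})$-expansions for quantities $\tilde p_{ir}(\varepsilon)$ by the $(l^-_{ir}, l^+_{ir}, \delta_{ir}, G_{ir}, \varepsilon_{ir})$-expansions for transition probabilities $p_{ir}(\varepsilon)$. These are the only changes required in the algorithm for construction of asymptotic expansions for expectations ${}_re_{ij}(\varepsilon)$, $i, j \in {}_rX$, given in Steps 6.1-6.8.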


The above remarks can be summarized in the following theorem. 


Theorem 10.5 Conditions A, B', C-E, F', assumed to hold for the semi-Markov process $\eta^{(\varepsilon)}(t)$, also hold for the reduced semi-Markov process ${}_r\eta^{(\varepsilon)}(t)$, for every $r \in X$. The upper bounds for remainders in expansions appearing in conditions B' and F' are given for transition probabilities ${}_rp_{ij}(\varepsilon)$, $j \in {}_rY_i$, $i \in {}_rX$, $r \in X$, and expectations ${}_re_{ij}(\varepsilon)$, $j \in {}_rY_i$, $i \in {}_rX$, $r \in X$, in Algorithms 4 and 6.


5 Sequential Reduction of Phase Space 
for Semi-Markov Processes 


In this section, we present algorithms of sequential reduction of phase spaces for semi-Markov processes.


5.1 Algorithms of Sequential Reduction of Phase Spaces
for Semi-Markov Processes


Let $\eta^{(\varepsilon)}(t)$ be a semi-Markov process with the phase space $X = \{1, \ldots, N\}$, which satisfies conditions A-F.

Let $(r_1, \ldots, r_N)$ be a permutation of the sequence $(1, \ldots, N)$, and let $\bar r_n = (r_1, \ldots, r_n)$, $n = 1, \ldots, N$, be the corresponding sequence of growing chains of states from space $X$.

Let us choose a state $i \in X$ and a permutation $(r_1, \ldots, r_N)$ such that $r_N = i$.

Let us also assume that the initial distribution $\bar p^{(\varepsilon)}$ is concentrated in the state $i$, i.e., $p^{(\varepsilon)}_i = 1$.

Algorithm 7. This is an algorithm for sequential reduction of the phase space for the semi-Markov process $\eta^{(\varepsilon)}(t)$ and for constructing asymptotic expansions for transition probabilities and expectations of sojourn times for semi-Markov processes with reduced phase spaces.

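7.1. Let ${}_{\bar r_1}\eta^{(\varepsilon)}(t) = {}_{r_1}\eta^{(\varepsilon)}(t)$ be the reduced semi-Markov process which is the result of reduction of state $r_1$ for the semi-Markov process $\eta^{(\varepsilon)}(t)$. This semi-Markov process has the phase space ${}_{\bar r_1}X = X \setminus \{r_1\}$, transition probabilities of the embedded Markov chain ${}_{\bar r_1}p_{i'j'}(\varepsilon)$, $i', j' \in {}_{\bar r_1}X$, and expectations of transition times ${}_{\bar r_1}e_{i'j'}(\varepsilon)$, $i', j' \in {}_{\bar r_1}X$, which are determined by the transition probabilities and the expectations of transition times for the process $\eta^{(\varepsilon)}(t)$ via relations (87) and (96). According to Theorem 10.1, the expectations of hitting times $E_{i'j'}(\varepsilon)$, $i', j' \in {}_{\bar r_1}X$, coincide for the semi-Markov processes $\eta^{(\varepsilon)}(t)$ and ${}_{\bar r_1}\eta^{(\varepsilon)}(t)$. According to Theorems 10.2 and 10.4, the semi-Markov process ${}_{\bar r_1}\eta^{(\varepsilon)}(t)$ satisfies conditions A-F. The transition sets ${}_{\bar r_1}Y_{i'} = {}_{r_1}Y_{i'}$, $i' \in {}_{\bar r_1}X$, are determined for the process ${}_{\bar r_1}\eta^{(\varepsilon)}(t)$ by condition A and relation (90). Therefore, the $({}_{\bar r_1}l^-_{i'j'}, {}_{\bar r_1}l^+_{i'j'})$-expansions for transition probabilities ${}_{\bar r_1}p_{i'j'}(\varepsilon)$, $j' \in {}_{\bar r_1}Y_{i'}$, $i' \in {}_{\bar r_1}X$, and the $({}_{\bar r_1}m^-_{i'j'}, {}_{\bar r_1}m^+_{i'j'})$-expansions for expectations ${}_{\bar r_1}e_{i'j'}(\varepsilon)$, $j' \in {}_{\bar r_1}Y_{i'}$, $i' \in {}_{\bar r_1}X$, can be constructed by applying Algorithms 1, 3 and 5 to the $(l^-_{i'j'}, l^+_{i'j'})$-expansions for transition probabilities $p_{i'j'}(\varepsilon)$, $j' \in Y_{i'}$, $i' \in X$, and the $(m^-_{i'j'}, m^+_{i'j'})$-expansions for expectations $e_{i'j'}(\varepsilon)$, $j' \in Y_{i'}$, $i' \in X$. These expansions are pivotal.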

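7.2. Let ${}_{\bar r_2}\eta^{(\varepsilon)}(t)$ be the reduced semi-Markov process which is the result of reduction of state $r_2$ for the semi-Markov process ${}_{\bar r_1}\eta^{(\varepsilon)}(t)$. This semi-Markov process has the phase space ${}_{\bar r_2}X = X \setminus \{r_1, r_2\}$, the transition probabilities of the embedded Markov chain ${}_{\bar r_2}p_{i'j'}(\varepsilon)$, $i', j' \in {}_{\bar r_2}X$, and the expectations of transition times ${}_{\bar r_2}e_{i'j'}(\varepsilon)$, $i', j' \in {}_{\bar r_2}X$, which are determined by the transition probabilities and the expectations of transition times for the process ${}_{\bar r_1}\eta^{(\varepsilon)}(t)$ via relations (87) and (96). According to Theorem 10.1, the expectations of hitting times $E_{i'j'}(\varepsilon)$, $i', j' \in {}_{\bar r_2}X$, coincide for the semi-Markov processes $\eta^{(\varepsilon)}(t)$, ${}_{\bar r_1}\eta^{(\varepsilon)}(t)$ and ${}_{\bar r_2}\eta^{(\varepsilon)}(t)$. According to Theorems 10.2 and 10.4, the transition probabilities of the embedded Markov chain ${}_{\bar r_2}p_{i'j'}(\varepsilon)$, $i', j' \in {}_{\bar r_2}X$, and the expectations of transition times ${}_{\bar r_2}e_{i'j'}(\varepsilon)$, $i', j' \in {}_{\bar r_2}X$, satisfy conditions A-F. The transition sets ${}_{\bar r_2}Y_{i'}$, $i' \in {}_{\bar r_2}X$, are determined for the process ${}_{\bar r_2}\eta^{(\varepsilon)}(t)$ by condition A and relation (90) in the same way as the transition sets ${}_{\bar r_1}Y_{i'}$, $i' \in {}_{\bar r_1}X$, are determined by condition A and relation (90) for the process ${}_{\bar r_1}\eta^{(\varepsilon)}(t)$. Therefore, the $({}_{\bar r_2}l^-_{i'j'}, {}_{\bar r_2}l^+_{i'j'})$-expansions for transition probabilities ${}_{\bar r_2}p_{i'j'}(\varepsilon)$, $j' \in {}_{\bar r_2}Y_{i'}$, $i' \in {}_{\bar r_2}X$, and the $({}_{\bar r_2}m^-_{i'j'}, {}_{\bar r_2}m^+_{i'j'})$-expansions for expectations ${}_{\bar r_2}e_{i'j'}(\varepsilon)$, $j' \in {}_{\bar r_2}Y_{i'}$, $i' \in {}_{\bar r_2}X$, can be constructed by applying Algorithms 1, 3 and 5 to the $({}_{\bar r_1}l^-_{i'j'}, {}_{\bar r_1}l^+_{i'j'})$-expansions for transition probabilities ${}_{\bar r_1}p_{i'j'}(\varepsilon)$, $j' \in {}_{\bar r_1}Y_{i'}$, $i' \in {}_{\bar r_1}X$, and the $({}_{\bar r_1}m^-_{i'j'}, {}_{\bar r_1}m^+_{i'j'})$-expansions for expectations ${}_{\bar r_1}e_{i'j'}(\varepsilon)$, $j' \in {}_{\bar r_1}Y_{i'}$, $i' \in {}_{\bar r_1}X$. These expansions are pivotal.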

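7.3. By continuing the above procedure of phase space reduction for states $r_3, \ldots, r_{N-1}$, we construct the semi-Markov process ${}_{\bar r_{N-1}}\eta^{(\varepsilon)}(t)$ with the phase space ${}_{\bar r_{N-1}}X = X \setminus \{r_1, \ldots, r_{N-1}\} = \{i\}$ (which is a one-point set), the transition probabilities of the embedded Markov chain ${}_{\bar r_{N-1}}p_{ii}(\varepsilon) = 1$, and the expectations of transition times ${}_{\bar r_{N-1}}e_{ii}(\varepsilon)$, which are determined by the transition probabilities and the expectations of transition times of the process ${}_{\bar r_{N-2}}\eta^{(\varepsilon)}(t)$ via relations (87) and (96). According to Theorem 10.1, the expectations of hitting times $E_{ii}(\varepsilon)$ for the semi-Markov processes $\eta^{(\varepsilon)}(t)$, ${}_{\bar r_1}\eta^{(\varepsilon)}(t)$, $\ldots$, ${}_{\bar r_{N-1}}\eta^{(\varepsilon)}(t)$ coincide. According to Theorems 10.2 and 10.4, the transition probabilities of the embedded Markov chain ${}_{\bar r_{N-1}}p_{ii}(\varepsilon) = 1$ and the expectations of transition times ${}_{\bar r_{N-1}}e_{ii}(\varepsilon)$ satisfy conditions A-F. In this case, the transition set ${}_{\bar r_{N-1}}Y_i = \{i\}$, for every $i \in X$. Therefore, the $({}_{\bar r_{N-1}}l^-_{i'j'}, {}_{\bar r_{N-1}}l^+_{i'j'})$-expansions for transition probabilities ${}_{\bar r_{N-1}}p_{i'j'}(\varepsilon) = 1$, $j' \in {}_{\bar r_{N-1}}Y_{i'}$, $i' \in {}_{\bar r_{N-1}}X$ (which take the form of relation (58)), and the $({}_{\bar r_{N-1}}m^-_{i'j'}, {}_{\bar r_{N-1}}m^+_{i'j'})$-expansions for expectations ${}_{\bar r_{N-1}}e_{i'j'}(\varepsilon)$, $j' \in {}_{\bar r_{N-1}}Y_{i'}$, $i' \in {}_{\bar r_{N-1}}X$, can be constructed by applying Algorithms 1, 3 and 5 to the $({}_{\bar r_{N-2}}l^-_{i'j'}, {}_{\bar r_{N-2}}l^+_{i'j'})$-expansions for transition probabilities ${}_{\bar r_{N-2}}p_{i'j'}(\varepsilon)$, $j' \in {}_{\bar r_{N-2}}Y_{i'}$, $i' \in {}_{\bar r_{N-2}}X$, and the $({}_{\bar r_{N-2}}m^-_{i'j'}, {}_{\bar r_{N-2}}m^+_{i'j'})$-expansions for expectations ${}_{\bar r_{N-2}}e_{i'j'}(\varepsilon)$, $j' \in {}_{\bar r_{N-2}}Y_{i'}$, $i' \in {}_{\bar r_{N-2}}X$. These expansions are pivotal.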

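7.4. The semi-Markov process ${}_{\bar r_{N-1}}\eta^{(\varepsilon)}(t)$ has the one-point phase space ${}_{\bar r_{N-1}}X = \{i\}$ and, thus, the transition probability ${}_{\bar r_{N-1}}p_{ii}(\varepsilon) = 1$, while the expectation of transition time ${}_{\bar r_{N-1}}e_{ii}(\varepsilon) = E_{ii}(\varepsilon)$. The above algorithm of sequential reduction of phase space should be repeated for every $i \in X$. In this way, the Laurent asymptotic expansions for quantities $E_{ii}(\varepsilon)$, $i \in X$, can be written down. These asymptotic expansions have the following form,

$$E_{ii}(\varepsilon) = \sum_{l = M^-_{ii}}^{M^+_{ii}} B_{ii}[l]\varepsilon^l + \ddot o_i(\varepsilon^{M^+_{ii}}), \quad i \in X, \tag{104}$$

where parameters $M^{\pm}_{ii} = {}_{\bar r_{N-1}}m^{\pm}_{ii}$, $i \in X$, and the coefficients $B_{ii}[l] = {}_{\bar r_{N-1}}b_{ii}[l]$, $l = M^-_{ii}, \ldots, M^+_{ii}$, $i \in X$, where ${}_{\bar r_{N-1}}b_{ii}[l]$ are coefficients in the corresponding $({}_{\bar r_{N-1}}m^-_{i'j'}, {}_{\bar r_{N-1}}m^+_{i'j'})$-expansions for expectations ${}_{\bar r_{N-1}}e_{i'j'}(\varepsilon)$, $j' \in {}_{\bar r_{N-1}}Y_{i'}$, $i' \in {}_{\bar r_{N-1}}X$. These expansions are pivotal.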

It should be noted that, for every $n = 1, \ldots, N - 1$, the reduced semi-Markov process ${}_{\bar r_n}\eta^{(\varepsilon)}(t)$ is invariant with respect to any permutation $\bar r'_n = (r'_1, \ldots, r'_n)$ of the sequence $\bar r_n = (r_1, \ldots, r_n)$.

Indeed, for every such permutation $\bar r'_n = (r'_1, \ldots, r'_n)$, the corresponding reduced semi-Markov process ${}_{\bar r'_n}\eta^{(\varepsilon)}(t)$ is constructed from the initial semi-Markov process $\eta^{(\varepsilon)}(t)$ as the sequence of its states at sequential moments of hitting the same reduced phase space ${}_{\bar r'_n}X = X \setminus \{r'_1, \ldots, r'_n\} = {}_{\bar r_n}X = X \setminus \{r_1, \ldots, r_n\}$, and the times between sequential jumps of the reduced semi-Markov process ${}_{\bar r'_n}\eta^{(\varepsilon)}(t)$ are the times between sequential hittings of the above reduced space by the initial semi-Markov process $\eta^{(\varepsilon)}(t)$.

This implies that the expectation of transition time ${}_{\bar r_n}e_{i'j'}(\varepsilon)$ is, for every $i', j' \in {}_{\bar r_n}X$ and $n = 1, \ldots, N - 1$, invariant with respect to any permutation $\bar r'_n = (r'_1, \ldots, r'_n)$ of the sequence $\bar r_n = (r_1, \ldots, r_n)$.

Moreover, as follows from Algorithms 1-7, the expectation of transition time ${}_{\bar r_n}e_{i'j'}(\varepsilon)$ is a rational function of the initial transition probabilities $p_{ij}(\varepsilon)$, $j \in Y_i$, $i \in X$, and expectations $e_{ij}(\varepsilon)$, $j \in Y_i$, $i \in X$ (a quotient of two sums of products of some of these probabilities and expectations), which, according to the above remarks, is invariant with respect to any permutation $\bar r'_n = (r'_1, \ldots, r'_n)$ of the sequence $\bar r_n = (r_1, \ldots, r_n)$.

By using identical arithmetical transformations (expansion of brackets, factoring a common factor out of brackets, bringing a fractional expression to a common denominator, permutation of summands or multipliers, cancellation of expressions with equal absolute values and opposite signs in sums, cancellation of equal expressions in quotients, etc.) the rational function ${}_{\bar r'_n}e_{i'j'}(\varepsilon)$ given by Algorithm 7 can be transformed into the rational function ${}_{\bar r_n}e_{i'j'}(\varepsilon)$ given by Algorithm 7 and vice versa.

By Lemma 10.8, these transformations do not affect the corresponding asymptotic expansions for the expectation ${}_{\bar r_n}e_{i'j'}(\varepsilon)$ given by Algorithm 7, and, thus, these asymptotic expansions are invariant with respect to any permutation $\bar r'_n = (r'_1, \ldots, r'_n)$ of the sequence $\bar r_n = (r_1, \ldots, r_n)$.

The above remarks can be summarized in the following theorem. 


Theorem 10.6 Let conditions A-F hold for semi-Markov processes $\eta^{(\varepsilon)}(t)$. Then, for every $i \in X$, the Laurent asymptotic expansion (104) for the expectation of hitting times $E_{ii}(\varepsilon)$ given by Algorithm 7 can be written down. This expansion is invariant with respect to the choice of permutation $(r_1, \ldots, r_{N-1}, i)$ of sequence $(1, \ldots, N)$, in the above algorithm.
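The sequential reduction itself, and the invariance with respect to the order of exclusion, can be illustrated for fixed numerical values of the transition characteristics (that is, for one fixed value of $\varepsilon$). The sketch below is an illustration under stated assumptions: the helper names `exclude` and `return_time`, the dictionary representation and the three-state example with unit sojourn times are hypothetical and not taken from the text.

```python
from fractions import Fraction

def exclude(P, E, r):
    """One reduction step: exclude state r (dict-of-dict inputs, p_rr < 1)."""
    stay = Fraction(1) - P[r][r]
    keep = [s for s in P if s != r]
    P2 = {i: {j: P[i][j] + P[i][r] * P[r][j] / stay for j in keep} for i in keep}
    E2 = {i: {j: E[i][j]
                 + E[i][r] * P[r][j] / stay
                 + E[r][r] * P[i][r] * P[r][j] / stay ** 2
                 + E[r][j] * P[i][r] / stay
              for j in keep} for i in keep}
    return P2, E2

def return_time(P, E, order):
    """Exclude the states listed in `order` one by one; for the single
    remaining state i the reduced expectation e_ii equals E_ii."""
    for r in order:
        P, E = exclude(P, E, r)
    (i,) = P  # exactly one state must remain
    return E[i][i]

# Hypothetical chain with unit sojourn times (e_ij = p_ij); its stationary
# distribution is (8/35, 12/35, 15/35), so E_00 = 35/8.
F = Fraction
P = {0: {0: F(0), 1: F(1, 2), 2: F(1, 2)},
     1: {0: F(1, 4), 1: F(1, 4), 2: F(1, 2)},
     2: {0: F(1, 3), 1: F(1, 3), 2: F(1, 3)}}
E00_a = return_time(P, P, [1, 2])
E00_b = return_time(P, P, [2, 1])
```

Both orders of exclusion give the same value of $E_{00}$, in agreement with the invariance stated above.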


Let us now assume that conditions A, B', C-E, F' hold for the semi-Markov process $\eta^{(\varepsilon)}(t)$.

Algorithm 8. This is an algorithm for computing upper bounds for remainders in 
asymptotic expansions for transition probabilities and expectation of sojourn times 
for semi-Markov processes with reduced phase spaces. 

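8.1. Let ${}_{\bar r_1}\eta^{(\varepsilon)}(t) = {}_{r_1}\eta^{(\varepsilon)}(t)$ be the reduced semi-Markov process, which is constructed as described in Step 7.1 of Algorithm 7. According to Theorems 10.3 and 10.5, the semi-Markov process ${}_{\bar r_1}\eta^{(\varepsilon)}(t)$ satisfies conditions A, B', C-E, F'. Therefore, $({}_{\bar r_1}l^-_{i'j'}, {}_{\bar r_1}l^+_{i'j'}, {}_{\bar r_1}\delta_{i'j'}, {}_{\bar r_1}G_{i'j'}, {}_{\bar r_1}\varepsilon_{i'j'})$-expansions for transition probabilities ${}_{\bar r_1}p_{i'j'}(\varepsilon)$, $j' \in {}_{\bar r_1}Y_{i'}$, $i' \in {}_{\bar r_1}X$, and $({}_{\bar r_1}m^-_{i'j'}, {}_{\bar r_1}m^+_{i'j'}, {}_{\bar r_1}\dot\delta_{i'j'}, {}_{\bar r_1}\dot G_{i'j'}, {}_{\bar r_1}\dot\varepsilon_{i'j'})$-expansions for expectations ${}_{\bar r_1}e_{i'j'}(\varepsilon)$, $j' \in {}_{\bar r_1}Y_{i'}$, $i' \in {}_{\bar r_1}X$, can be constructed by applying Algorithms 1-5 to the $(l^-_{i'j'}, l^+_{i'j'}, \delta_{i'j'}, G_{i'j'}, \varepsilon_{i'j'})$-expansions for transition probabilities $p_{i'j'}(\varepsilon)$, $j' \in Y_{i'}$, $i' \in X$, and the $(m^-_{i'j'}, m^+_{i'j'}, \dot\delta_{i'j'}, \dot G_{i'j'}, \dot\varepsilon_{i'j'})$-expansions for expectations $e_{i'j'}(\varepsilon)$, $j' \in Y_{i'}$, $i' \in X$.

8.2. Let ${}_{\bar r_2}\eta^{(\varepsilon)}(t)$ be the reduced semi-Markov process, which is constructed as described in Step 7.2 of Algorithm 7. According to Theorems 10.3 and 10.5, the semi-Markov process ${}_{\bar r_2}\eta^{(\varepsilon)}(t)$ satisfies conditions A, B', C-E, F'. Therefore, $({}_{\bar r_2}l^-_{i'j'}, {}_{\bar r_2}l^+_{i'j'}, {}_{\bar r_2}\delta_{i'j'}, {}_{\bar r_2}G_{i'j'}, {}_{\bar r_2}\varepsilon_{i'j'})$-expansions for transition probabilities ${}_{\bar r_2}p_{i'j'}(\varepsilon)$, $j' \in {}_{\bar r_2}Y_{i'}$, $i' \in {}_{\bar r_2}X$, and $({}_{\bar r_2}m^-_{i'j'}, {}_{\bar r_2}m^+_{i'j'}, {}_{\bar r_2}\dot\delta_{i'j'}, {}_{\bar r_2}\dot G_{i'j'}, {}_{\bar r_2}\dot\varepsilon_{i'j'})$-expansions for expectations ${}_{\bar r_2}e_{i'j'}(\varepsilon)$, $j' \in {}_{\bar r_2}Y_{i'}$, $i' \in {}_{\bar r_2}X$, can be constructed by applying Algorithms 1-5 to the $({}_{\bar r_1}l^-_{i'j'}, {}_{\bar r_1}l^+_{i'j'}, {}_{\bar r_1}\delta_{i'j'}, {}_{\bar r_1}G_{i'j'}, {}_{\bar r_1}\varepsilon_{i'j'})$-expansions for transition probabilities ${}_{\bar r_1}p_{i'j'}(\varepsilon)$, $j' \in {}_{\bar r_1}Y_{i'}$, $i' \in {}_{\bar r_1}X$, and the $({}_{\bar r_1}m^-_{i'j'}, {}_{\bar r_1}m^+_{i'j'}, {}_{\bar r_1}\dot\delta_{i'j'}, {}_{\bar r_1}\dot G_{i'j'}, {}_{\bar r_1}\dot\varepsilon_{i'j'})$-expansions for expectations ${}_{\bar r_1}e_{i'j'}(\varepsilon)$, $j' \in {}_{\bar r_1}Y_{i'}$, $i' \in {}_{\bar r_1}X$.

8.3. Finally, let ${}_{\bar r_{N-1}}\eta^{(\varepsilon)}(t)$ be the reduced semi-Markov process, which is constructed as described in Step 7.3 of Algorithm 7. According to Theorems 10.3 and 10.5, the semi-Markov process ${}_{\bar r_{N-1}}\eta^{(\varepsilon)}(t)$ satisfies conditions A, B', C-E, F'. Therefore, $({}_{\bar r_{N-1}}l^-_{i'j'}, {}_{\bar r_{N-1}}l^+_{i'j'}, {}_{\bar r_{N-1}}\delta_{i'j'}, {}_{\bar r_{N-1}}G_{i'j'}, {}_{\bar r_{N-1}}\varepsilon_{i'j'})$-expansions for transition probabilities ${}_{\bar r_{N-1}}p_{i'j'}(\varepsilon)$, $j' \in {}_{\bar r_{N-1}}Y_{i'}$, $i' \in {}_{\bar r_{N-1}}X$, and $({}_{\bar r_{N-1}}m^-_{i'j'}, {}_{\bar r_{N-1}}m^+_{i'j'}, {}_{\bar r_{N-1}}\dot\delta_{i'j'}, {}_{\bar r_{N-1}}\dot G_{i'j'}, {}_{\bar r_{N-1}}\dot\varepsilon_{i'j'})$-expansions for expectations ${}_{\bar r_{N-1}}e_{i'j'}(\varepsilon)$, $j' \in {}_{\bar r_{N-1}}Y_{i'}$, $i' \in {}_{\bar r_{N-1}}X$, can be constructed by applying Algorithms 1-5 to the $({}_{\bar r_{N-2}}l^-_{i'j'}, {}_{\bar r_{N-2}}l^+_{i'j'}, {}_{\bar r_{N-2}}\delta_{i'j'}, {}_{\bar r_{N-2}}G_{i'j'}, {}_{\bar r_{N-2}}\varepsilon_{i'j'})$-expansions for transition probabilities ${}_{\bar r_{N-2}}p_{i'j'}(\varepsilon)$, $j' \in {}_{\bar r_{N-2}}Y_{i'}$, $i' \in {}_{\bar r_{N-2}}X$, and the $({}_{\bar r_{N-2}}m^-_{i'j'}, {}_{\bar r_{N-2}}m^+_{i'j'}, {}_{\bar r_{N-2}}\dot\delta_{i'j'}, {}_{\bar r_{N-2}}\dot G_{i'j'}, {}_{\bar r_{N-2}}\dot\varepsilon_{i'j'})$-expansions for expectations ${}_{\bar r_{N-2}}e_{i'j'}(\varepsilon)$, $j' \in {}_{\bar r_{N-2}}Y_{i'}$, $i' \in {}_{\bar r_{N-2}}X$.

8.4. Finally, due to equalities ${}_{\bar r_{N-1}}e_{ii}(\varepsilon) = E_{ii}(\varepsilon)$, $i \in X$, we get that the asymptotic expansion (104) for expectations $E_{ii}(\varepsilon)$, $i \in X$, given in Step 7.4 of Algorithm 7, is a $(M^-_{ii}, M^+_{ii}, \delta^\circ_i, G^\circ_i, \varepsilon^\circ_i)$-expansion with parameters $M^{\pm}_{ii} = {}_{\bar r_{N-1}}m^{\pm}_{ii}$, $\delta^\circ_i = {}_{\bar r_{N-1}}\dot\delta_{ii}$, $G^\circ_i = {}_{\bar r_{N-1}}\dot G_{ii}$, $\varepsilon^\circ_i = {}_{\bar r_{N-1}}\dot\varepsilon_{ii}$.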

In this case, the invariance of the explicit upper bounds for remainders given by Algorithm 8, with respect to the choice of a permutation $(r_1, \ldots, r_{N-1}, i)$ of the sequence $(1, \ldots, N)$, cannot be guaranteed.

However, Lemma 10.9 guarantees that the following inequalities hold for the parameters $\delta^\circ_i$, $i \in X$,


$$\delta^\circ_i \ge \delta^\circ = \min_{j \in Y_i,\, i \in X} (\delta_{ij} \wedge \dot\delta_{ij}). \tag{105}$$


The following theorem takes place. 


Theorem 10.7 Let conditions A, B', C-E, F' hold for semi-Markov processes $\eta^{(\varepsilon)}(t)$. Then, for every $i \in X$, the $(M^-_{ii}, M^+_{ii})$-expansion (104) for the expectations of hitting times $E_{ii}(\varepsilon)$, given by Algorithm 7, is a $(M^-_{ii}, M^+_{ii}, \delta^\circ_i, G^\circ_i, \varepsilon^\circ_i)$-expansion, with parameters $\delta^\circ_i, G^\circ_i, \varepsilon^\circ_i$ given in Algorithm 8. The inequality (105) holds for parameters $\delta^\circ_i$, $i \in X$.


5.2 Laurent Asymptotic Expansions for Expectations
of Hitting Times


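The algorithms presented above also yield the Laurent asymptotic expansions for expectations of hitting times $E_{ij}(\varepsilon)$, $i, j \in X$. Indeed, let us choose two states $i, j \in X$ and a chain of states $\bar r_{N-2} = (r_1, \ldots, r_{N-2})$, $r_1, \ldots, r_{N-2} \ne i, j$.

According to Theorem 10.1 and Algorithm 8, the expectations $E_{ij}(\varepsilon)$ coincide for the initial semi-Markov process $\eta^{(\varepsilon)}(t)$ and the semi-Markov process ${}_{\bar r_{N-2}}\eta^{(\varepsilon)}(t)$. The semi-Markov process ${}_{\bar r_{N-2}}\eta^{(\varepsilon)}(t)$ has a two-point phase space ${}_{\bar r_{N-2}}X = \{i, j\}$. The expectations of hitting times $E_{i'j'}(\varepsilon)$, $i' \in \{i, j\}$, can be found by solving, for every $j' \in \{i, j\}$, the system of (two, in this case) linear equations (77), which yields the following formulas, for every $j' \in \{i, j\}$,

$$
\begin{aligned}
E_{i'j'}(\varepsilon) &= {}_{\bar r_{N-2}}e_{i'}(\varepsilon) \cdot \frac{1}{{}_{\bar r_{N-2}}p_{i'j'}(\varepsilon)}, \\
E_{j'j'}(\varepsilon) &= {}_{\bar r_{N-2}}e_{j'}(\varepsilon) + {}_{\bar r_{N-2}}e_{i'}(\varepsilon) \cdot \frac{{}_{\bar r_{N-2}}p_{j'i'}(\varepsilon)}{{}_{\bar r_{N-2}}p_{i'j'}(\varepsilon)},
\end{aligned}
\tag{106}
$$

where $i' \ne j'$ in both equations in (106) and

$${}_{\bar r_{N-2}}e_{i'}(\varepsilon) = {}_{\bar r_{N-2}}e_{i'i}(\varepsilon) + {}_{\bar r_{N-2}}e_{i'j}(\varepsilon), \quad i' \in \{i, j\}. \tag{107}$$

The corresponding asymptotic expansions for $E_{ij}(\varepsilon)$ can be constructed by using the asymptotic expansions for transition probabilities ${}_{\bar r_{N-2}}p_{i'j'}(\varepsilon)$ and expectations ${}_{\bar r_{N-2}}e_{i'j'}(\varepsilon)$ given in Algorithms 7 and 8 and the operational rules for Laurent asymptotic expansions presented in Lemmas 10.1-10.9.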
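For fixed numerical values, formulas (106) and (107) can be evaluated directly. The following sketch assumes a two-state process given by dictionaries of transition probabilities and expectations of transition times; the function name `two_state_hitting_times` and the example data are illustrative assumptions (the data are those of a hypothetical three-state chain with unit sojourn times after exclusion of one state).

```python
from fractions import Fraction

def two_state_hitting_times(P, E, a, b):
    """Return (E_ab, E_bb) for a process with the two-point phase space {a, b}."""
    e_a = E[a][a] + E[a][b]               # relation (107) for state a
    e_b = E[b][a] + E[b][b]               # relation (107) for state b
    E_ab = e_a / P[a][b]                  # first formula in (106)
    E_bb = e_b + e_a * P[b][a] / P[a][b]  # second formula in (106)
    return E_ab, E_bb

F = Fraction
P2 = {0: {0: F(1, 4), 1: F(3, 4)}, 1: {0: F(1, 2), 1: F(1, 2)}}
E2 = {0: {0: F(5, 8), 1: F(9, 8)}, 1: {0: F(7, 8), 1: F(7, 8)}}
E_01, E_11 = two_state_hitting_times(P2, E2, 0, 1)
```

Here $E_{01} = 7/3$ and $E_{11} = 35/12$, which can be confirmed by solving the hitting-time equations for the underlying three-state chain directly.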


6 Asymptotic Expansions for Stationary Distributions 


In this section, we present algorithms for construction of asymptotic expansions for 
stationary distributions of nonlinearly perturbed semi-Markov processes. 


6.1 Asymptotic Expansions for Stationary Probabilities 
of Perturbed Semi-Markov Processes 


Let us recall relation (78) for stationary probabilities of the semi-Markov process $\eta^{(\varepsilon)}(t)$,

$$\pi_i(\varepsilon) = \frac{e_i(\varepsilon)}{E_{ii}(\varepsilon)}, \quad i \in X. \tag{108}$$

Algorithm 9. This is an algorithm for constructing asymptotic expansions for
stationary probabilities of perturbed semi-Markov processes.

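9.1. Conditions A-F and proposition (i) (the multiple summation rule) of Lemma 10.5 permit one to construct $(m^-_i, m^+_i)$-expansions for expectations $e_i(\varepsilon)$, $i \in X$, which take the following forms,

$$
\begin{aligned}
e_i(\varepsilon) &= \sum_{j \in Y_i} e_{ij}(\varepsilon) \\
&= \sum_{j \in Y_i} \Big( \sum_{l = m^-_{ij}}^{m^+_{ij}} b_{ij}[l]\varepsilon^l + \dot o_{ij}(\varepsilon^{m^+_{ij}}) \Big) \\
&= \sum_{l = m^-_i}^{m^+_i} b_i[l]\varepsilon^l + \dot o_i(\varepsilon^{m^+_i}), \quad i \in X,
\end{aligned}
\tag{109}
$$

where

$$m^-_i = \min_{j \in Y_i} m^-_{ij}, \quad m^+_i = \min_{j \in Y_i} m^+_{ij}, \quad i \in X, \tag{110}$$

and

$$b_i[m^-_i + l] = \sum_{j \in Y_i} b_{ij}[m^-_i + l], \quad l = 0, \ldots, m^+_i - m^-_i, \ i \in X, \tag{111}$$

where $b_{ij}[m^-_i + l] = 0$, for $0 \le l < m^-_{ij} - m^-_i$, $j \in Y_i$, $i \in X$.

The above asymptotic expansions are pivotal for all $i \in X$.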

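9.2. Conditions A-F, relation (108) and proposition (v) (the division rule) of Lemma 10.3 permit us to construct $(n^-_i, n^+_i)$-expansions for stationary probabilities $\pi_i(\varepsilon)$, $i \in X$, which take the following forms,

$$\pi_i(\varepsilon) = \sum_{l = n^-_i}^{n^+_i} c_i[l]\varepsilon^l + o_i(\varepsilon^{n^+_i}), \quad i \in X, \tag{112}$$

where

$$n^-_i = m^-_i - M^-_{ii}, \quad n^+_i = (m^+_i - M^-_{ii}) \wedge (m^-_i + M^+_{ii} - 2M^-_{ii}), \quad i \in X, \tag{113}$$

and

$$c_i[n^-_i + l] = \frac{b_i[m^-_i + l] - \sum_{1 \le l' \le l} B_{ii}[M^-_{ii} + l']\, c_i[n^-_i + l - l']}{B_{ii}[M^-_{ii}]}, \quad l = 0, \ldots, n^+_i - n^-_i, \ i \in X. \tag{114}$$

Since $\pi_i(\varepsilon) > 0$, $i \in X$, $\varepsilon \in (0, \varepsilon_0]$, the asymptotic expansions (112) are pivotal, i.e., coefficients,

$$c_i[n^-_i] = b_i[m^-_i] / B_{ii}[M^-_{ii}] > 0, \quad i \in X. \tag{115}$$

By the definition, $e_i(\varepsilon) \le E_{ii}(\varepsilon)$, $i \in X$, $\varepsilon \in (0, \varepsilon_0]$. This implies that parameters $M^-_{ii} \le m^-_i$, $i \in X$, and thus, parameters

$$n^-_i \ge 0, \quad i \in X. \tag{116}$$

Moreover, since $\sum_{i \in X} \pi_i(\varepsilon) = 1$, for every $\varepsilon \in (0, \varepsilon_0]$, the parameters $n^{\pm}_i$, $i \in X$, and coefficients $c_i[l]$, $l = n^-_i, \ldots, n^+_i$, $i \in X$, satisfy the following relations,

$$n^- = \min_{i \in X} n^-_i = 0, \tag{117}$$

and

$$
c[l] = \sum_{i \in X} c_i[l] =
\begin{cases}
1 & \text{for } l = 0, \\
0 & \text{for } 0 < l \le \min_{i \in X} n^+_i.
\end{cases}
\tag{118}
$$

Let us introduce the set

$$X_0 = \{i \in X : n^-_i = 0\}.$$

By the above remarks, the following relation takes place,

$$
\pi_i(0) = \lim_{\varepsilon \to 0} \pi_i(\varepsilon) =
\begin{cases}
c_i[0] > 0 & \text{if } i \in X_0, \\
0 & \text{if } i \notin X_0.
\end{cases}
\tag{119}
$$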


Theorem 10.8 Let conditions A-F hold for semi-Markov processes $\eta^{(\varepsilon)}(t)$. Then, the $(n^-_i, n^+_i)$-expansions (112) for the stationary probabilities $\pi_i(\varepsilon)$, $i \in X$, given by Algorithm 9, can be written down. These expansions are invariant with respect to the choice of permutation $(r_1, \ldots, r_{N-1}, i)$ of sequence $(1, \ldots, N)$, in the above algorithm. Relations (115)-(119) hold for these expansions.
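The division step of Algorithm 9 can be sketched in code. The function below follows the recursion (114) and the parameter formulas (113) for truncated Laurent expansions stored as a lowest power together with a list of coefficients; this representation, the name `divide_expansion` and the example expansions are assumptions made for illustration, and remainders are not tracked.

```python
from fractions import Fraction

def divide_expansion(num, den):
    """Divide two truncated Laurent expansions with non-zero leading terms.

    Each argument is (h, [a_h, ..., a_k]). Returns (n_minus, coeffs), where
    coeffs has n_plus - n_minus + 1 entries, as in relations (113)-(114).
    """
    (ha, a), (hb, b) = num, den
    ka, kb = ha + len(a) - 1, hb + len(b) - 1
    n_minus = ha - hb
    n_plus = min(ka - hb, ha + kb - 2 * hb)
    c = []
    for l in range(n_plus - n_minus + 1):
        acc = Fraction(a[l])
        for s in range(1, l + 1):
            acc -= b[s] * c[l - s]  # subtract already known contributions
        c.append(acc / b[0])
    return n_minus, c

# Hypothetical example: e_i = 1 + 0*eps + 0*eps^2 + o(eps^2) and
# E_ii = 2/eps + 1 + o(1), so that pi_i = eps/2 - eps^2/4 + o(eps^2).
n_minus, coeffs = divide_expansion((0, [1, 0, 0]), (-1, [2, 1]))
```

The length of the resulting expansion is limited by the less accurate of the two operands, which is the role of the minimum in (113).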


6.2 Asymptotic Expansions for Stationary Probabilities 
of Perturbed Semi-Markov Processes with Explicit Upper 
Bounds for Remainders 


Algorithm 10. This is an algorithm for computing upper bounds for remainders 
in asymptotic expansions for stationary probabilities of perturbed semi-Markov 
processes. 

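10.1. Conditions A, B', C-E, F' and the proposition (i) (the multiple summation rule) of Lemma 10.6 imply that the $(m^-_i, m^+_i)$-expansions for expectations $e_i(\varepsilon)$, $i \in X$, are $(m^-_i, m^+_i, \dot\delta_i, \dot G_i, \dot\varepsilon_i)$-expansions, with parameters $\dot\delta_i, \dot G_i, \dot\varepsilon_i$, $i \in X$, given by the following formulas,

$$
\begin{aligned}
\dot\delta_i &= \min_{j \in Y_i:\ m^+_{ij} = m^+_i} \dot\delta_{ij}, \\
\dot G_i &= \sum_{j \in Y_i} \Big( \dot G_{ij}\, \dot\varepsilon_i^{\,m^+_{ij} + \dot\delta_{ij} - m^+_i - \dot\delta_i} + \sum_{m^+_i < l \le m^+_{ij}} |b_{ij}[l]|\, \dot\varepsilon_i^{\,l - m^+_i - \dot\delta_i} \Big), \\
\dot\varepsilon_i &= \min_{j \in Y_i} \dot\varepsilon_{ij}.
\end{aligned}
\tag{120}
$$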
10.2. Conditions A, B', C-E, F' and the propositions (iv) (the reciprocal rule) and (v) (the division rule) of Lemma 10.6 imply that the $(n^-_i, n^+_i)$-expansions for stationary probabilities $\pi_i(\varepsilon)$, $i \in X$, are $(n^-_i, n^+_i, \delta^*_i, G^*_i, \varepsilon^*_i)$-expansions, with parameters $\delta^*_i, G^*_i, \varepsilon^*_i$, $i \in X$, given by the following formulas,


ji ifn} =m) —M, <n; +M} —2M;,, 
5 = 1 6,63 ifn’ =m} —M, <n; +Myz —2M;,, 
OB ifn; =n; + M; — 2M, <m; —M,,, 


= -1 
G = A) » [Dil] (6) Ma 7 


+ - + - + 
m; A(m, +Mj; —M;, )<l<m, 


' > [bill llkcilk]|(ef yi M8 


m A(m, +Mj —M;)<l+k.m> <l<m; Mz <k<M} 


db G (eye hee 


k+M++6°—nt—M; —6* 
+4 0 leider i Mi ‘) 


ne <k<nz 


BilMj | ol-Me— 
ay iat |Bill]|(e; » a 


: +1130 M1 ee 4 

ee =e ABP AY + Ge(eo)Mi hep Malye ifM;, < M;;, 
BiilM;; 1, 37 : Se 

(ae) i ifM;, =M;; . 


(121) 


Theorem 10.9 Let conditions A, B', C-E, F' hold for semi-Markov processes
$\eta^{(\varepsilon)}(t)$. Then, the $(n_i^-, n_i^+, \delta_i^*, G_i^*, \varepsilon_i^*)$-expansions (112) for the stationary probabilities
$\pi_i(\varepsilon)$, $i \in \mathbb{X}$ given by Algorithms 9 and 10 can be written down. The inequalities
$\delta_i^* \geq \delta^\circ$, $i \in \mathbb{X}$ hold, where parameter $\delta^\circ$ is given in relation (105).


7 Future Studies and Bibliographical Remarks


In this section, we present some directions for future studies and short bibliographical
remarks concerning works in the area.


7.1 Directions for Future Studies 


The method of sequential reduction of a phase space presented in the paper can also 
be applied for getting asymptotic expansions for high order power and exponential 
moments of hitting times, for nonlinearly perturbed semi-Markov processes. 

In the present paper, we consider the model, where the pre-limiting perturbed 
semi-Markov processes have a phase space which is one class of communicative 
states, while the limiting unperturbed semi-Markov process has a phase space which 
consists of one or several classes of communicative states and possibly a class of 
transient states. However, the method of sequential reduction of the phase space can 
also be applied to nonlinearly perturbed semi-Markov processes with absorption and, 
therefore, to the model, where the pre-limiting semi-Markov processes also have a 
phase space, which consists of several classes of communicative states and a class 
of transient states. 

We are quite sure that a combination of results in the above two directions with
the methods of asymptotic analysis for nonlinearly perturbed regenerative processes
developed in Silvestrov [301, 304, 305] and Gyllenberg and Silvestrov [99, 100,
102, 104] will make it possible to expand results concerning asymptotic expansions
for quasi-stationary distributions and other characteristics for nonlinearly perturbed 
semi-Markov processes with absorption, where the limiting semi-Markov process 
has a phase space which consists of one class of communicative states and a class 
of transient states, to a general case, where the limiting semi-Markov process has a 
phase space, which consists of several classes of communicative states and a class 
of transient states. Some additional results and examples can be found in the recent 
paper by Silvestrov, D. and Silvestrov, S. [321]. 

The problems of aggregation of steps in the time-space screening procedures 
for semi-Markov processes, tracing pivotal orders for different groups of states as 
well as getting explicit matrix formulas, for coefficients and parameters of upper 
bounds for remainders in the corresponding asymptotic expansions for stationary 
distributions and moments of hitting times, do require additional studies. It can be 
expected that such formulas can be obtained, for example, for birth-death type semi- 
Markov processes, for which the proposed algorithms of reduction of a phase space 
preserve the birth-death structure for reduced semi-Markov processes. Some initial 
results in this direction are presented in the recent paper by Silvestrov, Petersson and
Hössjer [319].

We are going to present results concerning Laurent asymptotic expansions for
power and exponential moments of hitting times, quasi-stationary distributions and
explicit formulas for coefficients and parameters of upper bounds for remainders
for some specific classes of semi-Markov models, as well as applications to some
models of population genetics, information networks and queuing systems, in future
publications.


7.2 Bibliographical Remarks 


Note first of all that the model of perturbed discrete time Markov chains, at least, in the 
most difficult case of so-called singularly perturbed Markov chains and semi-Markov 
processes with absorption and asymptotically uncoupled phase spaces, attracted the
attention of researchers in the middle of the 20th century.

The methods used for construction of asymptotic expansions for stationary dis- 
tributions and related functionals such as moments of hitting times can be split in 
three groups. 

(1) The first works related to asymptotical problems for the above models are 
Meshalkin [221], Simon and Ando [323], Hanen [106-109], Kingman [169], Dar- 
roch and Seneta [65, 66], Keilson [160, 161], Seneta [273-276], Schweitzer [265], 
Korolyuk [177], Silvestrov [287-293], Anisimov [11-15], Korolyuk and Turbin 
[193, 194], Gusak and Korolyuk [96], Turbin [344, 345], Korolyuk, Penev and 
Turbin [189], Kovalenko [199, 200], Poliščuk and Turbin [256], Korolyuk, Brodi
and Turbin [179], Pervozvanskii and Smirnov [247], Courtois [56] and Gaitsgori 
and Pervozvanskii [88]. 

(2) Convergence results for distributions and moments of hitting times, eigenvalues,
eigenvectors, stationary and quasi-stationary distributions, Perron roots, coefficients
of ergodicity, etc. have been studied in works by Meshalkin [221], Hanen
[106-109], Kingman [169], Darroch and Seneta [65, 66], Keilson [160, 161], Seneta
[273-276, 282], Schweitzer [265, 266], Korolyuk [177, 178], Silvestrov [287—289, 
291, 293-297, 300, 303, 309], Anisimov [11-19], Korolyuk and Turbin [193-196], 
Gusak and Korolyuk [96], Turbin [344], Korolyuk, Penev and Turbin [189], Masol 
and Silvestrov [222], Zakusilo [360, 361], Kovalenko [199, 200], Korolyuk, Brodi 
and Turbin [179], Gaitsgori and Pervozvanskiy [88, 89], Allen, Anderssen and Seneta 
[8], Kaplan [146, 147], Korolyuk, Turbin and Tomusjak [197], Shurenkov [285, 286], 
Anisimov and Chernyak [20], Anisimov, Voina and Lebedev [21], Coderch, Willsky, 
Sastry and Castañon [53], Korolyuk, D. [174], Korolyuk, D. and Silvestrov [175,
176], Stewart [329], Koury, McAllister and Stewart [198], McAllister, Stewart and 
Stewart, W. [220], Cao and Stewart [51], Kartashov [152, 153, 155], Korolyuk and 
Tadzhiev [192], Burnley [49], Gibson and Seneta [90], Haviv [113], Haviv, Ritov 
and Rothblum [121], Rohlichek [259], Rohlicek and Willsky [260, 261], Silvestrov 
and Velikii [322], Alimov and Shurenkov [6, 7], Hunter [138], Latouche [208], 
Pollett and Stewart [257], Motsa and Silvestrov [223], Hoppensteadt, Salehi and 
Skorokhod [127], Kalashnikov [143], Korolyuk and Limnios [181, 187], Marek and 
Mayer [216], Yin and Zhang [354-357], Craven [64], Hernández-Lerma and Lasserre
[125], Yin, Zhang and Badowski [358], Silvestrov and Drozdenko [313, 314], Kupsa
and Lacroix [204], Drozdenko [70-72], Barbour and Pollett [38, 39], Glynn [91],
Benois, Landim and Mourragui [42], Serlet [283] and Meyer [227].

(3) Rates of convergence, errors of approximation, sensitivity and related stability 
theorems for Markov chains and related models of stochastic processes have been 
studied in works by Schweitzer [265, 267-269], Silvestrov [287, 290, 292], Seneta 
[276-282], Courtois [56, 58], Gaitsgori and Pervozvanskiy [88, 89], Kovalenko 
[200], Kalashnikov [142-144], Latouche and Louchard [209], Berman and
Plemmons [43], Meyer [225, 228], Kalashnikov and Anichkin [145], Bobrova [47], 
Louchard and Latouche [214, 215], Stewart [327—329, 331-337], Courtois and 
Semal [60-63], Haviv and Rothblum [123], Haviv and Van der Heyden [124], Koury, 
McAllister and Stewart [198], McAllister, Stewart, G. and Stewart, W. [220], Funder- 
lic and Meyer [87], Kartashov [148—150, 152-155, 157], Vantilborgh [347], Haviv 
[112, 115, 117], Haviv and Ritov [118, 120], Rohlichek [259], Rohlicek and Will- 
sky [260, 261], Stewart and Sun [339], Hunter [137—139, 141], Stewart and Zhang 
[340], Hassin and Haviv [111], Barlow [40], Meyn and Tweedie [230], Lasserre 
[206], Pollett and Stewart [257], Stewart, G., Stewart, W. and McAllister [338], 
Borovkov [48], Yin and Zhang [354-357], Li, Yin, G., Yin, K. and Zhang [213], 
Craven [64], Kontoyiannis and Meyn [173], Mitrophanov [231-234], Zhang and Yin
[362], Mitrophanov, Lomsadze and Borodovsky [235], Guo [95] and Sirl, Zhang and 
Pollett [324]. 

(4) Asymptotic expansions for distributions of hitting times, moments of hitting
times, resolvents, eigenvalues, eigenvectors, stationary and quasi-stationary
distributions, Perron roots, etc., have been studied in works by Turbin [345], Poliščuk
and Turbin [256], Korolyuk, Brodi and Turbin [179], Pervozvanskii and Smirnov
[247], Courtois and Louchard [59], Korolyuk and Turbin [195, 196], Courtois [57],
Latouche and Louchard [209], Kokotović, Phillips and Javid [170], Korolyuk, Penev
and Turbin [190], Phillips and Kokotović [253], Delebecque [67], Abadov [1],
Kartashov [151, 155], Haviv [112], Korolyuk [178], Stewart and Sun [339], Silvestrov
and Abadov [311, 312], Haviv, Ritov and Rothblum [122], Haviv and Ritov [119],
Schweitzer and Stewart [272], Silvestrov [301, 304, 305], Englund and
Silvestrov [77], Gyllenberg and Silvestrov [99, 100, 102, 104], Korolyuk and Limnios
[181-187], Stewart [335, 336], Yin and Zhang [354, 356, 357], Avrachenkov [26,
27], Avrachenkov and Lasserre [34], Korolyuk, V.S. and Korolyuk, V.V. [180],
Englund [75, 76], Yin, G., Zhang, Yang and Yin, K. [359], Avrachenkov and Haviv
[31, 32], Craven [64], Avrachenkov, Filar and Howlett [30], Petersson [248-252],
Silvestrov, D. and Silvestrov, S. [320, 321] and Silvestrov, Petersson and Hössjer
[319].

(5) Asymptotic expansions for other characteristics of Markov type processes
are presented in works by Nagaev [236, 237], Leadbetter [211], Poliščuk and Turbin
[256], Quadrat [258], Abadov [1], Silvestrov and Abadov [310-312], Stewart and Sun
[339], Kartashov [155, 158], Khasminskii, Yin and Zhang [165, 166], Wentzell [350,
351], Cao [50], Gyllenberg and Silvestrov [99, 100, 102, 104], Fuh and Lai [86],
Kontoyiannis and Meyn [173], Fuh [84, 85], Samoilenko [263, 264], Silvestrov [301,
304, 305], Ni [238-242], Ni, Silvestrov and Malyarenko [243], Albeverio, Koroliuk
and Samoilenko [3], Avrachenkov, Filar and Howlett [30], Petersson
[248] and Silvestrov and Petersson [318].

(6) We would like especially to mention books including problems on perturbed 
Markov chains, semi-Markov processes and related problems. These are Seneta [276, 
282], Silvestrov [293], Korolyuk and Turbin [194, 195], Courtois [57], Kalashnikov 
[142, 144], Anisimov [18, 19], Stewart and Sun [339], Korolyuk and Swishchuk 
[191], Meyn and Tweedie [230], Kartashov [155], Borovkov [48], Stewart [335, 
336], Yin and Zhang [355-357], Korolyuk, V.S. and Korolyuk, V.V. [180], Bini, 
Latouche and Meini [46], Koroliuk and Limnios [187], Gyllenberg and Silvestrov 
[104] and Avrachenkov, Filar and Howlett [30]. 

(7) General results of perturbation theory of matrices and linear operators are 
presented in works by Vishik and Lyusternik [349], Wilkinson [353], Stewart [325- 
327, 330, 335, 336], Plotkin and Turbin [254, 255], Korolyuk and Turbin [195, 196], 
Berman and Plemmons [43], Wentzell and Freidlin [352], Haviv [114], Meyer and 
Stewart [229], Bielecki and Stettner [44], Delebecque [68], Stewart and Sun [339], 
Hunter [139, 141], Haviv and Ritov [120], Lasserre [206], Kartashov [155], Hoppen- 
steadt, Salehi and Skorokhod [129], Avrachenkov [26], Korolyuk, V.S. and Korolyuk, 
V.V. [180], Li and Stewart [212], Avrachenkov, Haviv and Howlett [33], Howlett 
and Avrachenkov [134], Howlett, Pearce and Torokhti [136], Torokhti, Howlett and 
Pearce [343], Verhulst [348], Howlett, Avrachenkov, Pearce and Ejov [135], Howlett, 
Albrecht and Pearce [133], Albrecht, Howlett and Pearce [4, 5] and Avrachenkov and 
Lasserre [35]. In particular, we would like to mention some books which contain
material on general perturbation theory for matrices and operators. These are Erdélyi [81],
Kato [159], Cole [54], Korolyuk and Turbin [195, 196], Wentzell and Freidlin [352],
Kevorkian and Cole [163, 164], Baumgärtel [41], Stewart [335, 336], Korolyuk, V.S.
and Korolyuk, V.V. [180], Konstantinov, Gu, Mehrmann and Petkov [171], Verhulst
[348], Gyllenberg and Silvestrov [104] and Avrachenkov, Filar and Howlett [30].

(8) Applications of results on perturbed Markov type processes to the control the- 
ory, decision processes, Internet, queuing theory, mathematical genetics, population 
dynamics and epidemic models, insurance and financial mathematics are presented in 
works by Simon and Ando [323], Kovalenko [200, 201], Courtois [57], Kalashnikov 
[142, 144], Delebecque and Quadrat [69], Kovalenko and Kuznetsov [202], Quadrat 
[258], Gut and Holst [97], Schweitzer [266], Anisimov, Zakusilo and Donchenko 
[22], Latouche [207], Pervozvanskii and Gaitsgori [246], Meyer [224], Ho and Cao 
[126], Asmussen [23, 24], Gyllenberg and Silvestrov [98, 101, 103, 104], Pollett 
and Stewart [257], Abbad and Filar [2], Hoppensteadt, Salehi and Skorokhod [128], 
Kijima [167], Kovalenko, Kuznetsov, and Pegg [203], Borovkov [48], Yin and Zhang 
[354-357], Englund [73, 74], Yin, G., Zhang, Yang and Yin, K. [359], Avrachenkov, 
Filar and Haviv [29], Altman, Avrachenkov and Núñez-Queija [9], Langville and
Meyer [205], Silvestrov and Drozdenko [314], Avrachenkov, Litvak, and Son Pham
[36, 37], Drozdenko [70, 72], Andersson and Silvestrov, S. [10], Anisimov [19],
Konstantinov and Petkov [172], Avrachenkov, Borkar and Nemirovsky [28], Barbour
and Pollett [38, 39], Blanchet and Zwart [45], Hössjer [130, 131], Engström
and Silvestrov, S. [78-80], Hössjer and Ryman [132], Ni [242], Petersson [249, 252],
Silvestrov [306-308] and Silvestrov, Petersson and Hössjer [319]. In particular, we
would like to mention some books in this area. These are Kovalenko [200], Kalashnikov
[142, 144], Anisimov, Zakusilo and Donchenko [22], Pervozvanskii and Gaitsgori
[246], Kijima [167], Kovalenko, Kuznetsov, and Pegg [203], Asmussen [24], Anisimov
[19], Gyllenberg and Silvestrov [104], Koroliuk and Limnios [188], Asmussen
and Albrecher [25], Avrachenkov, Filar and Howlett [30] and Silvestrov [307, 308].

(9) Exact and related approximative computational methods for stationary and 
quasi-stationary distributions of Markov chains and semi-Markov processes and 
related problems are presented in works by Romanovskii [262], Feller [83], Kemeny 
and Snell [162], Golub and Seneta [92], Seneta [276, 282], Paige, Styan and Wachter 
[245], Silvestrov [298, 299, 302], Chatelin and Miranker [52], Harrod and Plemmons 
[110], Schweitzer [266, 269], Grassman, Taksar and Heyman [94], Schweitzer, Put- 
erman and Kindle [271], Sheskin [284], Hunter [137, 138], Schweitzer and Kindle 
[270], Feinberg and Chiu [82], Haviv [113, 115], Haviv, Ritov and Rothblum [121], 
Sumita and Reiders [342], Mattingly and Meyer [219], Stewart, W. [341], Kim and 
Smith [168], Stewart [335, 336], Latouche and Ramaswami [210], Kartashov [156], 
Meyer [226], Häggström [105], Bini, Latouche and Meini [46], Golub and Van
Loan [93], Silvestrov, Manca and Silvestrova [317], Van Doorn and Pollett [346]
and Silvestrov and Manca [315, 316]. In particular, we would like to mention some
related books. These are Romanovskii [262], Feller [83], Kemeny and Snell [162],
Golub and Seneta [92], Seneta [276, 282], Berman and Plemmons [43], Silvestrov
[298], Meyer [226], Häggström [105], Bini, Latouche and Meini [46], Meyn and
Tweedie [230], Hernández-Lerma and Lasserre [125], Gyllenberg and Silvestrov
[104], Nåsell [244], Avrachenkov, Filar and Howlett [30] and Collet, Martínez and
San Martín [55].


PageRank, a Look at Small Changes
in a Line of Nodes and the Complete Graph 


Christopher Engström and Sergei Silvestrov


Abstract In this article we will look at the PageRank algorithm used as part of the 
ranking process of different Internet pages in search engines by for example Google. 
This article has its main focus in the understanding of the behavior of PageRank as 
the system dynamically changes either by contracting or expanding such as when 
adding or subtracting nodes or links or groups of nodes or links. In particular we will 
take a look at link structures consisting of a line of nodes or a complete graph where 
every node links to all others. We will look at PageRank as the solution of a linear 
system of equations and examine both the ordinary normalized version
of PageRank and the non-normalized version found by solving the corresponding
linear system. We will show that using two different methods we can find explicit
formulas for the PageRank of some simple link structures.


Keywords PageRank - Graph - Random walk - Block matrix 


1 Introduction 


PageRank is a method in which we can rank nodes in different link structures such 
as Internet pages on the Web in order of “importance” given the link structure of 
the complete system. It is important that the method is extremely fast since there is 
a huge number of Internet pages. It is also important that the algorithm returns the 
most relevant results first since very few people will look through more than a couple 
of pages when doing a search in a search engine, [6]. 

While PageRank was originally constructed for use in search engines, there
are other uses of PageRank or similar methods, for example in the EigenTrust
algorithm for reputation management to decrease distribution of unauthentic files
in P2P networks, [14].

Calculating PageRank is usually done using the Power method, which can be
implemented very efficiently, even for very large systems. The convergence speed
of the Power method and its dependence on certain parameters have been studied
to some extent. For example, the Power method on a graph structure such as that
created by the Web will converge with a convergence rate of c, where c is one
of the parameters used in the definition [11], and the problem is well conditioned
unless c is very close to 1 [13]. However, since the number of pages on the Web is
huge, extensive work has been done in trying to improve the computation time of
PageRank even further. One example is aggregating webpages that are “close” and
are expected to have a similar PageRank, as in [12]. Another method used to speed up
calculations is found in [18], where the PageRank of pages that have already
converged is not recomputed in every iteration. Other methods to speed up calculations
include removing “dangling nodes” before computing PageRank and then calculating
their rank at the end, or using other formulations such as a power series formulation
of PageRank [2].

There is also work done on a large scale using PageRank and other measures
in order to learn more about the Web, for example looking at the distribution of
PageRank both theoretically and experimentally, such as in [8].

While the theory behind PageRank is well understood from Perron-Frobenius theory
for non-negative irreducible matrices [3, 10, 15] and the study of Markov chains
[16, 17], how PageRank is affected by changes in the system or parameters is
not as well known.

In this article we start by giving a short introduction on PageRank and some nota- 
tion and definitions used throughout the article. We will look at PageRank as the solu- 
tion to a linear system of equations and what we can learn using this representation. 
Looking at some common graph structures we want to gain a better understanding 
of the changes in PageRank as the graph structure changes. This could for example 
be used in finding good approximations of PageRank of certain structures in order 
to speed up calculations further. 

We will look at both the “ordinary” normalized version of PageRank as well as 
a non-normalized version we get by solving the linear system. We will see how this 
non-normalized version corresponds to the probabilities of a random walk through 
the graph and how we can use this to find the PageRank of some systems using this 
perspective rather than solving the system or computing the dominant eigenvector. 

Mainly two different structures will be examined: first a simple line in Sect. 5 and
later a complete graph in Sect. 6. In both cases we will see that we can find explicit
expressions for the PageRank depending on the number of nodes. For both
the “ordinary” PageRank and the non-normalized version, expressions for the
PageRank will be found for the structure itself as well as for the structure after
doing some simple modifications.


2 Calculating PageRank


We start with a number of nodes (Internet pages) and the non-negative matrix A, where
every element $a_{ij} \neq 0$ corresponds to a link from node i to node j. The value of
such an element is $a_{ij} = 1/r_i$, where $r_i$ is the number of outgoing links from node i. An example
of a graph and corresponding matrix can be seen in Fig. 1.

By convention we do not allow any loops (nodes linking to themselves). We also 
need that no nodes have zero outgoing links (dangling nodes) resulting in a row with 
all zeros. For now we assume that none of these dangling nodes are present in the 
link matrix. This means that every row will sum to one in the link matrix A. 

The PageRank vector R we want for ranking the nodes (pages) is the eigenvector
corresponding to the dominant eigenvalue with value one of matrix M:

$$M = cA^\top + (1 - c)ue^\top,$$

where 0 < c < 1, usually c = 0.85, A is the link matrix, e is a column vector of the
same length as the number of nodes (n) filled with ones and u is a column vector
of the same length with elements $u_i$, $0 \leq u_i \leq 1$, such that $\|u\|_1 = 1$. For u we will
usually use the uniform vector (all elements equal) with $u_i = 1/n$ where n is the
number of nodes. The result after calculating the PageRank of the example matrix
for the system in Fig. 1 can be seen below:


0.3328 
0.3763 
0.1974 
0.0934 


This can be seen as a random walk where we start in a random node depending
on the weight vector u. Then with probability c we go to any of the nodes linked to
from that node, and with probability 1 - c we instead go to a random (in the case of
uniform u) new node. The PageRank vector can be seen as the probability that the
walker, after a long time, is located in the node in question [2]. More on why an eigenvector
with eigenvalue 1 always exists can be found in, for example, [7].
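
As a small illustration, the example values above can be reproduced with a plain Power method. This is only a sketch under the assumptions c = 0.85 and uniform u; the function names `power_step` and `pagerank` are illustrative and not taken from the article.

```python
# Link matrix A of the example in Fig. 1: row i holds 1/r_i for every node that node i links to.
A = [
    [0.0, 1.0, 0.0, 0.0],
    [1/2, 0.0, 1/2, 0.0],
    [1/3, 1/3, 0.0, 1/3],
    [1.0, 0.0, 0.0, 0.0],
]
c = 0.85                      # assumed value of the parameter c
n = len(A)
u = [1.0 / n] * n             # uniform weight vector


def power_step(rank):
    """One multiplication R <- M R with M = c A^T + (1 - c) u e^T."""
    total = sum(rank)         # e^T R, equal to one for a normalized vector
    return [
        c * sum(A[i][j] * rank[i] for i in range(n)) + (1 - c) * u[j] * total
        for j in range(n)
    ]


def pagerank(tol=1e-12, max_iter=1000):
    """Power method: iterate until the L1 change between iterates is below tol."""
    rank = u[:]
    for _ in range(max_iter):
        new_rank = power_step(rank)
        if sum(abs(a - b) for a, b in zip(new_rank, rank)) < tol:
            return new_rank
        rank = new_rank
    return rank


R = pagerank()
print([round(x, 4) for x in R])   # [0.3328, 0.3763, 0.1974, 0.0934]
```

Since every row of A sums to one, each step preserves the sum of the vector, so no renormalization is needed during the iteration.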


Fig. 1 Directed graph and corresponding system matrix A

$$A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1/2 & 0 & 1/2 & 0 \\ 1/3 & 1/3 & 0 & 1/3 \\ 1 & 0 & 0 & 0 \end{bmatrix}$$


Role of c.

Looking at the formula it is not immediately obvious why we demand 0 < c < 1
and what role c holds. We can easily see what happens at the limits: if c = 0 the
PageRank is decided only by the initial weights u. However, if c = 1 the weights have
no role and the algorithm used for calculating PageRank might not even converge.
As c increases, nodes further and further away have an impact on the PageRank of
individual nodes. Conversely, the lower c is, the more important is
the immediate surrounding of a node in deciding its PageRank. The parameter c is
also a very important factor in how fast the algorithms used to calculate PageRank
converge: the higher c is, the slower the algorithm will converge.
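
The effect of c on convergence speed can be seen in a small experiment. The sketch below counts Power method iterations on the example graph of Fig. 1 for a few values of c; the tolerance, the chosen values of c and the function name `iterations_to_converge` are assumptions made for illustration only.

```python
# Link matrix of the example in Fig. 1 (rows sum to one, no dangling nodes).
A = [
    [0.0, 1.0, 0.0, 0.0],
    [1/2, 0.0, 1/2, 0.0],
    [1/3, 1/3, 0.0, 1/3],
    [1.0, 0.0, 0.0, 0.0],
]


def iterations_to_converge(c, tol=1e-10, max_iter=100_000):
    """Count Power method steps until the L1 change between iterates drops below tol."""
    n = len(A)
    u = [1.0 / n] * n
    rank = u[:]
    for step in range(1, max_iter + 1):
        # R <- c A^T R + (1 - c) u, using e^T R = 1 for a normalized R.
        new_rank = [
            c * sum(A[i][j] * rank[i] for i in range(n)) + (1 - c) * u[j]
            for j in range(n)
        ]
        change = sum(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if change < tol:
            return step
    return max_iter


counts = {c: iterations_to_converge(c) for c in (0.5, 0.85, 0.99)}
for c, steps in counts.items():
    print(f"c = {c}: {steps} iterations")
```

On this small graph the iteration count grows as c approaches one, in line with the convergence rate discussed above.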


Handling of dangling nodes. 


If A contains dangling nodes, the corresponding rows no longer sum to one and there
will therefore probably not be any eigenvector with eigenvalue equal to one. The
method we use in order to fix this is to instead assume that the dangling nodes
link to all nodes equally (or according to some other distribution over the nodes). This gives us
$T = A + gw^\top$, where g is a column vector with elements equal to one for a dangling
node and zero for all other nodes. Here w is the distribution according to which we
make the dangling nodes link to other nodes (usually uniform or equal to u). In this
work we always use w = u to simplify calculations.

There are other ways to handle dangling nodes, for example by adding one new 
node linking only to itself and let all dangling nodes link to this node. Assuming 
w = u these methods should be essentially the same apart from implementation [5]. 
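
The correction $T = A + gw^\top$ can be sketched as follows. The three-node graph is a hypothetical example chosen only to contain one dangling node, and w = u is taken to be uniform.

```python
# Hypothetical 3-node graph: node 3 has no outgoing links (a dangling node).
A = [
    [0.0, 1/2, 1/2],   # node 1 links to nodes 2 and 3
    [0.0, 0.0, 1.0],   # node 2 links to node 3
    [0.0, 0.0, 0.0],   # node 3 is dangling: the row sums to zero
]
n = len(A)
u = [1.0 / n] * n
w = u                  # the text always uses w = u

# g has a one for every dangling node and a zero otherwise.
g = [1.0 if sum(row) == 0 else 0.0 for row in A]

# T = A + g w^T: every dangling row is replaced by the distribution w.
T = [[A[i][j] + g[i] * w[j] for j in range(n)] for i in range(n)]

row_sums = [sum(row) for row in T]
print(T)
print(row_sums)
```

After the correction every row of T sums to one, so T can be used in place of A in the definition of M.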


3 Notation and Definitions 


Here we give some notes on the notation used through the rest of the article in order 
to clarify which variation of PageRank is used as well as some overall notation and 
the definition of some common important link structures. We will repeatedly use the 
$L^1$ norm in comparing the size of different vectors or (parts of) matrices.

First some overall notation: 


• $S_G$: The system of nodes and links for which we want to calculate PageRank. It
contains the system matrix $A_G$ as well as a weight vector $v_G$. Subindex G can be
either a capital letter or a number in the case of multiple systems.

• $n_G$: The number of nodes in system $S_G$.

• $A_G$: System matrix where a zero element $a_{ij}$ means there is no link from node i to
node j. Non-zero elements are equal to $1/r_i$ where $r_i$ is the number of links from
node i. Size $n_G \times n_G$.

• $v_G$: Non-negative weight vector, not necessarily with sum one. Size $n_G \times 1$.

• $u_G$: The weight vector $v_G$ normalized such that $\|u_G\|_1 = 1$. We note that $u_G$ is
proportional to $v_G$ ($u_G \propto v_G$). Size $n_G \times 1$.

• c: Parameter 0 < c < 1 for calculating PageRank, usually c = 0.85.

• $g_G$: Vector with elements equal to one for dangling nodes and zero for all other nodes in
$S_G$. Size $n_G \times 1$.

• $M_G$: Modified system matrix, $M_G = c(A_G + g_G u_G^\top)^\top + (1 - c)u_G e^\top$, used to
calculate PageRank, where e is the unit vector (all elements equal to one). Size $n_G \times n_G$.


In the cases where there is only one possible system the subindex G will often be 
omitted. 

Earlier we saw how we could calculate PageRank for a system S. We also
make the assumption that w = u, both since it simplifies calculations and since it has
no obvious disadvantages: both vectors play largely the same role in that they
can be used to penalize or promote certain nodes.

We will use two different ways to define different versions of PageRank using the
notation $R_G^{(t)}$, where t is the type of PageRank used and G is the graph or part of graph
for which R is the PageRank. Often G is the whole graph, in which case the subindex
is usually omitted: $R^{(t)}$.

We will sometimes give the formula for a specific node i; in this case it will be
denoted $R_i^{(t)}$. When normalizing the resulting elements such that their sum is equal to
one we get the traditional PageRank:


Definition 1 $R_G^{(1)}$ for system $S_G$ is defined as the eigenvector with eigenvalue one of
the matrix $M_G = c(A_G + g_G u_G^\top)^\top + (1 - c)u_G e^\top$.


Note that we always have $\|R^{(1)}\|_1 = 1$ and that non-zero elements in $R^{(1)}$ are all
positive. The property $\|R^{(1)}\|_1 = 1$ does generally not hold in other versions of
PageRank. When instead setting up the resulting equation system and solving it we
get the second definition; the result is multiplied with $n_G$ in order to get multiplication
with the one vector in case of uniform $u_G$.


Definition 2 $R_G^{(2)}$ for system $S_G$ is defined as $R_G^{(2)} = (I - cA_G^\top)^{-1} n_G u_G$.

We note that generally $\|R^{(2)}\|_1 \neq 1$ as well as $R^{(2)} \neq n_G R^{(1)}$ unless there are no
dangling nodes in the system. However the two versions of PageRank are proportional
to each other ($R^{(2)} \propto R^{(1)}$).


Definition 3 A simple line is a graph with $n_L$ nodes where node $n_L$ links to node
$n_L - 1$, which in turn links to node $n_L - 2$, all the way until node 2 links to node 1.


The link matrix $A_L$ and graph for system $S_L$ consisting of a simple line with 5 nodes
can be seen in Fig. 2.


Definition 4 A complete graph is a group of nodes in which all nodes in the group
link to all other nodes in the group.


The link matrix $A_G$ for system $S_G$ consisting of a complete graph with 5 nodes can
be seen in Fig. 3.


$$A_L = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{bmatrix}$$

Fig. 2 The simple line with 5 nodes and corresponding system matrix


$$A_G = \frac{1}{4}\begin{bmatrix} 0 & 1 & 1 & 1 & 1 \\ 1 & 0 & 1 & 1 & 1 \\ 1 & 1 & 0 & 1 & 1 \\ 1 & 1 & 1 & 0 & 1 \\ 1 & 1 & 1 & 1 & 0 \end{bmatrix}$$

Fig. 3 A complete graph with five nodes and corresponding system matrix


4 Calculating Non-normalized PageRank 


While ordinary normalized PageRank $R^{(1)}$ is usually calculated using the Power
method or some other similar iterative method, in order to find nice analytic forms
using non-normalized PageRank we will use a number of different ways to calculate
it. From now on we will assume uniform u, which simplifies calculations significantly.

In this article we will look at two methods to calculate PageRank ($R^{(2)}$). While
neither method is especially useful for calculating PageRank of large systems, they
give exact answers as compared to the usual iterative methods. The goal is to use
these in order to learn something about the behavior of some common typical graph
structures within a system. From earlier we have:


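$$R^{(1)} = MR^{(1)} = \left(c(A + gu^\top)^\top + (1 - c)ue^\top\right)R^{(1)}. \qquad (1)$$

Calculating the dominant eigenvector $R^{(1)}$ is the same as solving the linear system:

$$R^{(1)} = MR^{(1)} \Leftrightarrow (cA^\top - I)R^{(1)} = -(cug^\top + (1 - c)ue^\top)R^{(1)}. \qquad (2)$$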


Since every column of $ug^\top$ is either equal to u or zero, and all columns of $ue^\top$ are equal to u,
we can see that $-(cug^\top + (1 - c)ue^\top)R^{(1)}$ will be proportional to u. This
can be written as: $(cA^\top - I)R^{(1)} = ku$.

We choose k = -n so that nu equals the one vector in the case of uniform
u, with the minus sign to get positive rank. Solving the system we get:


$$R^{(2)} = (I - cA^\top)^{-1} nu. \qquad (3)$$


To get the rank to sum to one it is a simple matter of normalizing the result:
$R^{(1)} = R^{(2)}/\|R^{(2)}\|_1$ [5]. We note the similarity of this formulation of PageRank
(solution to $R^{(2)} = cA^\top R^{(2)} + nu$) with the one for the potential of a Markov chain
with a discounted cost (solution to $R^{(2)} = \alpha AR^{(2)} + c$), where $0 < \alpha < 1$ is the
discount factor and c is a cost vector, [16].
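
The relation between the two versions can be checked numerically on the simple line with 5 nodes. The sketch below obtains $R^{(2)}$ from the fixed-point form $R^{(2)} = cA^\top R^{(2)} + nu$ rather than by inverting a matrix, and $R^{(1)}$ by the Power method on M with the dangling-node correction. The value c = 0.85, the tolerances and all function names are assumptions for illustration.

```python
n = 5
c = 0.85
# Simple line with 5 nodes: node k links to node k - 1 (0-based: row i, column i - 1).
A = [[1.0 if j == i - 1 else 0.0 for j in range(n)] for i in range(n)]
u = [1.0 / n] * n
g = [1.0 if sum(row) == 0 else 0.0 for row in A]   # dangling-node indicator


def transpose_apply(vec):
    """Return A^T vec."""
    return [sum(A[i][j] * vec[i] for i in range(n)) for j in range(n)]


def non_normalized_rank(max_iter=10_000, tol=1e-13):
    """R^(2): iterate R <- c A^T R + n u, whose fixed point is (I - c A^T)^{-1} n u."""
    r = [0.0] * n
    for _ in range(max_iter):
        new_r = [c * x + n * u[j] for j, x in enumerate(transpose_apply(r))]
        if sum(abs(a - b) for a, b in zip(new_r, r)) < tol:
            return new_r
        r = new_r
    return r


def normalized_rank(max_iter=10_000, tol=1e-13):
    """R^(1): Power method on M = c (A + g u^T)^T + (1 - c) u e^T."""
    r = u[:]
    for _ in range(max_iter):
        dangling_mass = sum(g[i] * r[i] for i in range(n))   # g^T R
        total = sum(r)                                        # e^T R
        new_r = [
            c * x + c * u[j] * dangling_mass + (1 - c) * u[j] * total
            for j, x in enumerate(transpose_apply(r))
        ]
        if sum(abs(a - b) for a, b in zip(new_r, r)) < tol:
            return new_r
        r = new_r
    return r


R2 = non_normalized_rank()
R1 = normalized_rank()
R2_scaled = [x / sum(R2) for x in R2]    # R^(2) / ||R^(2)||_1
print([round(x, 4) for x in R1])
print([round(x, 4) for x in R2_scaled])
```

Both printed vectors should agree, even though A itself is never modified for the dangling node when computing $R^{(2)}$.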

Note that we do not need to take any care of the dangling nodes when calculating
the PageRank in this way, but it is obviously a lot slower than using the Power method or
other conventional methods of calculating PageRank since we need to invert a large
sparse matrix. Although we do not need to change A for dangling nodes, the result
when doing so is changed (but still proportional to $R^{(1)}$). We will never change A
for dangling nodes when solving the linear system and only use the version defined
above. Note that while solving the equation system is slow, it is possible to get this
or a similar non-normalized version of PageRank using another PageRank algorithm,
such as using a power series formulation as in [1] or by first calculating the ordinary
normalized PageRank and then scaling it appropriately [9].

The following theorem explains how PageRank ($R^{(2)}$) can be computed and how
it can be interpreted from a probabilistic viewpoint using random walks on a graph and
the hitting probabilities of said random walks.


Theorem 1 Consider a random walk on a graph described by cA as
before. We walk to a new node with probability c and stop with probability 1 - c.
PageRank $R^{(2)}$ of a node when using uniform u can be written:


$$R_j^{(2)} = \left(\sum_{e_i \in S,\, e_i \neq e_j} P(e_i \to e_j) + 1\right)\left(\sum_{k=0}^{\infty} (P(e_j \to e_j))^k\right), \qquad (4)$$


where $P(e_i \to e_j)$ is the probability to hit node $e_j$ in a random walk starting in node
$e_i$ described as above. This can be seen as the expected number of visits to $e_j$ if we
do multiple random walks, starting in every node once.
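
The interpretation as an expected number of visits can be illustrated by simulation. The sketch below uses the example graph of Fig. 1 with c = 0.85 (both assumptions for illustration), compares a partial sum of $\sum_k (cA^\top)^k e$ with the average number of visits per node over many rounds of random walks, one walk started in every node per round. The function names, the number of rounds and the seed are arbitrary choices.

```python
import random

# Link matrix of the example in Fig. 1 (no dangling nodes).
A = [
    [0.0, 1.0, 0.0, 0.0],
    [1/2, 0.0, 1/2, 0.0],
    [1/3, 1/3, 0.0, 1/3],
    [1.0, 0.0, 0.0, 0.0],
]
c = 0.85
n = len(A)


def neumann_sum(terms=400):
    """Partial sum of sum_k (c A^T)^k e, approximating R^(2) for uniform u."""
    term = [1.0] * n                      # (c A^T)^0 e
    total = term[:]
    for _ in range(terms):
        term = [c * sum(A[i][j] * term[i] for i in range(n)) for j in range(n)]
        total = [t + x for t, x in zip(total, term)]
    return total


def simulate_visits(rounds=20_000, seed=0):
    """Average visits per node when one walk is started in every node in each round."""
    rng = random.Random(seed)
    visits = [0] * n
    for _ in range(rounds):
        for start in range(n):
            node = start
            visits[node] += 1             # the starting node counts as a visit (k = 0)
            while rng.random() < c:       # continue with probability c, stop otherwise
                node = rng.choices(range(n), weights=A[node])[0]
                visits[node] += 1
    return [v / rounds for v in visits]


exact = neumann_sum()
simulated = simulate_visits()
print([round(x, 3) for x in exact])
print([round(x, 3) for x in simulated])
```

The simulated averages should lie close to the series values, with a statistical error that shrinks as the number of rounds grows.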


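Proof $((cA^\top)^k)_{ji}$ is the probability to be in node $e_j$ starting in node $e_i$ after k steps.
Multiplying with the unit vector e (vector with all elements equal to one) therefore
gives the sum of all the probabilities to be in node $e_j$ after k steps starting in every
node once. The expected total number of visits is the sum of all probabilities to be
in node $e_j$ for every step starting in every node:


$$\left(\left(\sum_{k=0}^{\infty} (cA^\top)^k\right) e\right)_j. \qquad (5)$$


$\sum_{k=0}^{\infty} (cA^\top)^k$ is the Neumann series of $(I - cA^\top)^{-1}$, which is guaranteed to converge
since $cA^\top$ is non-negative and has column sums < 1. If u is uniform we get by the
definition:


$$R^{(2)} = (I - cA^\top)^{-1} nu = (I - cA^\top)^{-1} e = \left(\sum_{k=0}^{\infty} (cA^\top)^k\right) e. \qquad (6)$$


A walk started in $e_i \neq e_j$ reaches $e_j$ at least once with probability $P(e_i \to e_j)$, a walk
started in $e_j$ is there from the beginning, and once in $e_j$ the expected number of visits,
counting returns, is the geometric series in $P(e_j \to e_j)$. Hence

$$R_j^{(2)} = \left(\sum_{e_i \in S,\, e_i \neq e_j} P(e_i \to e_j) + 1\right)\left(\sum_{k=0}^{\infty} (P(e_j \to e_j))^k\right). \qquad (7)$$


5 Changes in the Simple Line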


Using the simple line as defined earlier we recall that we had the link matrix with an 
image of the system in Fig. 2 


00000 
10000 
A= |]01000 
00100 
00010 


By setting up the system of equations we get the inverse (| — cA')~! as: 


(l—cA')'=]| 00 


Note that this needs only to be multiplied with nu or a multiple of u for us to get 
a meaningful ranking. This gives us R® (for uniform u): 


R@ Sfi4+c4C4oeC 4 lécte te, iteter,1+e1)'. 


If we want the common normalized ranking $R^{(1)}$, we need to normalize the
result to sum to one. Looking at the elements $a_{ij}$ of $(I - cA^\top)^{-1}$ and considering
the example with a random walk through the graph, we can see the value of every
element $a_{ij}$ as the probability to get from node $e_j$ to node $e_i$. In the case where the
link matrix contains nodes with paths back to themselves, we will later see that it is actually
not the probability to get there but the sum of the probabilities of all paths from $e_j$ to $e_i$,
corresponding to Theorem 1. We can motivate this further by looking at the same
line but adding a link back from the first node to the second node.
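The simple-line example can be reproduced with exact rational arithmetic. The following is a minimal sketch, assuming a five-node line, uniform u and the illustrative value c = 0.85; the helper names `line_inverse`, `line_system` and `matmul` are introduced here only for this check.

```python
from fractions import Fraction

def line_inverse(n, c):
    """Claimed inverse of (I - c A^T) for the simple line: entry (i, j) is
    c^(j-i) for j >= i and zero otherwise."""
    return [[c ** (j - i) if j >= i else Fraction(0) for j in range(n)]
            for i in range(n)]

def line_system(n, c):
    """Build I - c A^T directly from the links of the simple line."""
    m = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for i in range(n - 1):
        m[i][i + 1] = -c  # node i+2 links to node i+1 (1-based numbering)
    return m

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

c = Fraction(85, 100)
inv = line_inverse(5, c)
product = matmul(line_system(5, c), inv)  # should be the identity matrix
r2 = [sum(row) for row in inv]            # uniform u: row sums of the inverse
r1 = [x / sum(r2) for x in r2]            # normalized PageRank
```

The row sums give the non-normalized ranking, and dividing by their total gives the normalized one.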


5.1 The Simple Line with Node One Linking to Node Two 


Letting node one link to node two in the earlier example gives us the graph in Fig. 4.

Fig. 4 Simple line where the first node links to the second and corresponding system matrix

\[
A = \begin{bmatrix}
0 & 1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0
\end{bmatrix}
\]

The resulting inverse can be written as

\[
(I - cA^\top)^{-1} = \begin{bmatrix}
s & sc & sc^2 & sc^3 & sc^4 \\
sc & s & sc & sc^2 & sc^3 \\
0 & 0 & 1 & c & c^2 \\
0 & 0 & 0 & 1 & c \\
0 & 0 & 0 & 0 & 1
\end{bmatrix},
\]

where $s = \sum_{k=0}^{\infty} c^{2k} = \frac{1}{1 - c^2}$ is the sum of all the probabilities of getting from node
1 or 2 back to itself.
From this we can see that the following observations seem to be true.

• The sum of a column j is at most $\sum_{k=0}^{\infty} c^k = \frac{1}{1-c}$ when using uniform u, with
equality if there are no paths to any dangling node from node j and node j is not
a dangling node itself.

• A diagonal element is equal to one if the node has no paths leading back to itself.

• Setting one element $u_i$ of u to zero only affects the influence of a random walk
starting in the corresponding node.

• Every non-zero element in a row can be written as the diagonal element
of that row times the sum of probabilities of getting from the other node to
the node corresponding to the current row.

• Each element $a_{ij}$ of $(I - cA^\top)^{-1}$ contains the sum of probabilities of all paths
starting in node j and ending in node i, when doing a random walk that follows
a random link with probability c and stops with probability 1 − c.


This is consistent with the statement that the normalized PageRank $R^{(1)}$ of a node
is the probability that a surfer that starts in a random node (page) and keeps clicking
links with probability c or starts at a new random page with probability 1 − c is
in a given node. However, here we can explicitly see all the probabilities and their
influence on the ranking [7].
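The inverse for the line with the extra link 1 → 2, and the column-sum observation, can be checked with exact arithmetic. This is a sketch only: the Gauss-Jordan helper `inverse` and the value c = 0.85 are assumptions made for the illustration, not part of the original text.

```python
from fractions import Fraction

def inverse(m):
    """Gauss-Jordan inverse in exact rational arithmetic."""
    n = len(m)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

c = Fraction(85, 100)
n = 5
# Graph of Fig. 4 with 0-based ids: the line 5 -> 4 -> 3 -> 2 -> 1 plus 1 -> 2.
links = {0: [1], 1: [0], 2: [1], 3: [2], 4: [3]}
system = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
for src, targets in links.items():
    for dst in targets:
        system[dst][src] -= c / len(targets)  # entry of -c A^T

inv = inverse(system)
s = 1 / (1 - c**2)
expected = [
    [s, s * c, s * c**2, s * c**3, s * c**4],
    [s * c, s, s * c, s * c**2, s * c**3],
    [0, 0, 1, c, c**2],
    [0, 0, 0, 1, c],
    [0, 0, 0, 0, 1],
]
column_sums = [sum(inv[i][j] for i in range(n)) for j in range(n)]
```

Since this graph has no dangling nodes, every column sum reaches the upper bound 1/(1 − c).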


5.2 Removing a Link Between Two Nodes 


When removing a link between two nodes in the simple line we end up with two
smaller disjoint lines instead. We note that these could be calculated separately and
we would still have the same relation between them. This is interesting since it is
not possible when using the “Power method” or otherwise calculating $R^{(1)}$ directly:
more nodes in a system means a lower mean rank, since in that case we normalize
the result to sum to one.

In particular, in the inverse $(I - cA^\top)^{-1}$ we see that when we remove one link, we
remove all the elements in the upper right corresponding to paths from nodes above
the removed link to all the ones below it. An example of what the new inverse looks
like when removing the link between the third node and the second node in Fig. 2
can be seen below:

\[
(I - cA^\top)^{-1} = \begin{bmatrix}
1 & c & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & c & c^2 \\
0 & 0 & 0 & 1 & c \\
0 & 0 & 0 & 0 & 1
\end{bmatrix},
\]

with PageRank $R^{(2)} = [\,1 + c,\; 1,\; 1 + c + c^2,\; 1 + c,\; 1\,]^\top$ and normalizing constant
$N = 5 + 3c + c^2$, when using a uniform u.
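The claim that the two disjoint pieces can be computed separately is easy to test. The sketch below (helper name `r2_acyclic` and c = 0.85 are illustrative assumptions) evaluates the Neumann series exactly, which is possible here because the graphs contain no cycles.

```python
from fractions import Fraction

def r2_acyclic(links, n, c):
    """Non-normalized PageRank with uniform u for a graph without cycles.

    Iterating x <- e + c A^T x adds one term of the Neumann series per pass;
    without cycles (c A^T)^n = 0, so n passes give the exact result.
    """
    x = [Fraction(1)] * n
    for _ in range(n):
        y = [Fraction(1)] * n
        for src, targets in links.items():
            for dst in targets:
                y[dst] += c * x[src] / len(targets)
        x = y
    return x

c = Fraction(85, 100)
# Simple line 5 -> 4 -> 3 -> 2 -> 1 with the link 3 -> 2 removed (0-based ids).
whole = r2_acyclic({1: [0], 3: [2], 4: [3]}, 5, c)
# The two disjoint pieces on their own: nodes {1, 2} and nodes {3, 4, 5}.
top = r2_acyclic({1: [0]}, 2, c)
bottom = r2_acyclic({1: [0], 2: [1]}, 3, c)
norm = sum(whole)
```

Concatenating `top` and `bottom` reproduces `whole`, and `norm` is the normalizing constant N.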


5.3 Adding a New Node Pointing at One Node 
in the Simple Line 


A more interesting example is what happens when we add a single
new node linking to one other node in the simple line. Since we make no changes in
the line, that part of the inverse will stay the same. We will however add a new row
and column. The non-diagonal elements of the new column can be found immediately
as c times the column corresponding to the node our new node links to. This is because
we have probability c of getting to that node, instead of 1 when we start in it. Finally
we need to add the one at the new diagonal element. An example of what the
inverse looks like after adding a new node pointing at node 3 in Fig. 2 can be seen
below:

\[
(I - cA^\top)^{-1} = \begin{bmatrix}
1 & c & c^2 & c^3 & c^4 & c^3 \\
0 & 1 & c & c^2 & c^3 & c^2 \\
0 & 0 & 1 & c & c^2 & c \\
0 & 0 & 0 & 1 & c & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}.
\]

From this easy example we can immediately get an expression for the PageRank
of a simple line with one or more added nodes linking to any of the nodes in the
simple line.


Theorem 2 The PageRank of a node $e_i$ belonging to the line in a system containing
a simple line with one outside node linking to one of the nodes in the line when using
uniform weight vector u can be written:

\[
R^{(2)}_i = \sum_{k=0}^{n_L - i} c^k + b_{ij} = \frac{1 - c^{\,n_L - i + 1}}{1 - c} + b_{ij}, \tag{8}
\]

\[
b_{ij} = \begin{cases} c^{\,j + 1 - i}, & j \ge i, \\ 0, & j < i, \end{cases} \tag{9}
\]

where $n_L$ is the number of nodes in the line and the new node links to node j. The
new node has rank 1. After normalization we get the PageRank of node i as:

\[
R^{(1)}_i = \frac{\dfrac{1 - c^{\,n_L - i + 1}}{1 - c} + b_{ij}}
{n_L + 1 + (n_L - 1)c + (n_L - 2)c^2 + \cdots + c^{\,n_L - 1} + \dfrac{c(1 - c^{j})}{1 - c}}, \tag{10}
\]

where $R^{(1)}_i$, $R^{(2)}_i$ is the PageRank of one of the nodes in the original line, $n_L$ is the
number of nodes on the line, and j is the number of the node linked to by the new node.
Adding further new nodes linking to the line means adding additional $b_{ij}$
parts and adding the corresponding part $\frac{c(1 - c^{j})}{1 - c}$ (together with the rank 1 of the
new node itself) to the normalizing constant.


Proof From Theorem 1, PageRank for a node when using uniform u can be written
as:

\[
R^{(2)}_i = \left( \sum_{e_j \in S,\, e_j \neq e_i} P(e_j \to e_i) + 1 \right)
            \left( \sum_{k=0}^{\infty} \bigl( P(e_i \to e_i) \bigr)^k \right),
\]

where $P(e_j \to e_i)$ is the probability to hit node $e_i$ starting in node $e_j$ when we
consider a random walk on a graph given by cA as described before: we walk to a
new node with probability c and stop with probability 1 − c. Since no node in this
system has a path back to itself, $P(e_i \to e_i) = 0$ and the second factor is equal to one.
The probability of getting to any node $e_i$ in the line from any other node $e_j$ in the
line once is:

\[
P(e_j \to e_i) = c^{\,j - i}, \quad j > i, \tag{11}
\]

and zero otherwise. Summation over all j > i gives

\[
\sum_{e_j \in S,\, e_j \neq e_i} P(e_j \to e_i) + 1 = \sum_{k=1}^{n_L - i} c^k + 1 = \frac{1 - c^{\,n_L - i + 1}}{1 - c}, \tag{12}
\]

where $n_L$ is the number of nodes in the line. With the first part shown we need to
show that the single outside node linking to node $e_j$ adds $b_{ij} = c^{\,j + 1 - i}$, $j \ge i$. We
get this probability in the same way by instead looking at the line created by the first
j nodes plus the extra node added linking to node j. We get the probability to reach
node $e_j$ as c and then $c^2$ for the next and so on. If i > j, $e_i$ does not belong to this
line, and we obviously cannot reach it from $e_j$, hence $b_{ij} = 0$, i > j.

Last, the PageRank of the “outside” node linking to a node in the line is obviously
1 since no node links to it. The normalized PageRank is found by dividing $R^{(2)}$ by
$\|R^{(2)}\|_1$.
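The expressions (8)-(10) can be compared against a direct computation. The sketch below assumes a line of five nodes with the outside node linking to node 3 and c = 0.85; these values and the helper names `theorem2` and `direct` are chosen only for illustration.

```python
from fractions import Fraction

def theorem2(n_line, j, c):
    """R^(2) from (8)-(9): line nodes 1..n_line, then the outside node."""
    ranks = []
    for i in range(1, n_line + 1):
        line_part = (1 - c ** (n_line - i + 1)) / (1 - c)
        b_ij = c ** (j + 1 - i) if j >= i else Fraction(0)
        ranks.append(line_part + b_ij)
    ranks.append(Fraction(1))  # the outside node has rank 1
    return ranks

def direct(n_line, j, c):
    """Same ranks from the Neumann series; exact because there are no cycles."""
    n = n_line + 1
    # 0-based ids: line node k links to k - 1; id n_line is the outside node.
    links = {k: [k - 1] for k in range(1, n_line)}
    links[n_line] = [j - 1]
    x = [Fraction(1)] * n
    for _ in range(n):
        y = [Fraction(1)] * n
        for src, targets in links.items():
            for dst in targets:
                y[dst] += c * x[src]  # every node here has exactly one link
        x = y
    return x

c = Fraction(85, 100)
formula = theorem2(5, 3, c)
check = direct(5, 3, c)
# Normalizing constant from (10) for n_L = 5 and j = 3.
norm = 5 + 1 + 4 * c + 3 * c**2 + 2 * c**3 + c**4 + c * (1 - c**3) / (1 - c)
```

Dividing each entry of `formula` by `norm` gives the normalized ranking of (10).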


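We also give a proof using matrices, but first we will need the following lemma
for blockwise inversion, used repeatedly throughout the article. We note that we label
the blocks from B to E rather than from A to D in order to avoid confusion with the
system matrix A.

Lemma 1

\[
\begin{bmatrix} B & C \\ D & E \end{bmatrix}^{-1} =
\begin{bmatrix}
(B - CE^{-1}D)^{-1} & -(B - CE^{-1}D)^{-1} C E^{-1} \\
-E^{-1} D (B - CE^{-1}D)^{-1} & E^{-1} + E^{-1} D (B - CE^{-1}D)^{-1} C E^{-1}
\end{bmatrix}, \tag{13}
\]

where B, E are square and E, $(B - CE^{-1}D)$ are nonsingular.

Proof To prove the Lemma it is enough to show that:

\[
\begin{bmatrix} B & C \\ D & E \end{bmatrix}
\begin{bmatrix}
(B - CE^{-1}D)^{-1} & -(B - CE^{-1}D)^{-1} C E^{-1} \\
-E^{-1} D (B - CE^{-1}D)^{-1} & E^{-1} + E^{-1} D (B - CE^{-1}D)^{-1} C E^{-1}
\end{bmatrix} = I. \tag{14}
\]

Looking at the result blockwise we get:

\[
B(B - CE^{-1}D)^{-1} - CE^{-1}D(B - CE^{-1}D)^{-1}
= (B - CE^{-1}D)(B - CE^{-1}D)^{-1} = I, \tag{15}
\]

\[
\begin{aligned}
&-B(B - CE^{-1}D)^{-1}CE^{-1} + C\bigl(E^{-1} + E^{-1}D(B - CE^{-1}D)^{-1}CE^{-1}\bigr) \\
&\quad = CE^{-1} - (B - CE^{-1}D)(B - CE^{-1}D)^{-1}CE^{-1} = CE^{-1} - ICE^{-1} = 0,
\end{aligned} \tag{16}
\]

\[
\begin{aligned}
&D(B - CE^{-1}D)^{-1} - EE^{-1}D(B - CE^{-1}D)^{-1} \\
&\quad = D(B - CE^{-1}D)^{-1} - D(B - CE^{-1}D)^{-1} = 0,
\end{aligned} \tag{17}
\]

\[
\begin{aligned}
&-D(B - CE^{-1}D)^{-1}CE^{-1} + E\bigl(E^{-1} + E^{-1}D(B - CE^{-1}D)^{-1}CE^{-1}\bigr) \\
&\quad = -D(B - CE^{-1}D)^{-1}CE^{-1} + I + D(B - CE^{-1}D)^{-1}CE^{-1} = I.
\end{aligned} \tag{18}
\]

This gives:

\[
\begin{bmatrix} B & C \\ D & E \end{bmatrix}
\begin{bmatrix}
(B - CE^{-1}D)^{-1} & -(B - CE^{-1}D)^{-1} C E^{-1} \\
-E^{-1} D (B - CE^{-1}D)^{-1} & E^{-1} + E^{-1} D (B - CE^{-1}D)^{-1} C E^{-1}
\end{bmatrix}
= \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix}. \tag{19}
\]

Furthermore we need E and $(B - CE^{-1}D)$ to be nonsingular in order for the
matrix to be invertible [4].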
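Lemma 1 can be checked on a small matrix. The sketch below restricts itself to a 1 × 1 block B (the case used later in the text) and compares the blockwise formula with a direct Gauss-Jordan inverse; the helper names and the 3 × 3 test matrix with c = 1/2 are assumptions made for this illustration.

```python
from fractions import Fraction

def inverse(m):
    """Gauss-Jordan inverse in exact rational arithmetic."""
    n = len(m)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def block_inverse(b, c_row, d_col, e):
    """Lemma 1 for a 1x1 block B = [b], row C, column D and square block E."""
    e_inv = inverse(e)
    k = len(e)
    ce = [sum(c_row[i] * e_inv[i][j] for i in range(k)) for j in range(k)]  # C E^-1
    ed = [sum(e_inv[i][j] * d_col[j] for j in range(k)) for i in range(k)]  # E^-1 D
    schur = b - sum(ce[j] * d_col[j] for j in range(k))                     # B - C E^-1 D
    top = [1 / schur] + [-x / schur for x in ce]
    rows = [[-ed[i] / schur]
            + [e_inv[i][j] + ed[i] * ce[j] / schur for j in range(k)]
            for i in range(k)]
    return [top] + rows

c = Fraction(1, 2)
a, b = -c / 2, -c / 3  # illustrative off-diagonal values
one = Fraction(1)
full = [[one, a, a], [b, one, a], [b, a, one]]
blockwise = block_inverse(one, [a, a], [b, b], [[one, a], [a, one]])
direct = inverse(full)
```

With exact fractions the two results are identical, not merely close.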


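When using Lemma 1 we will denote the individual blocks of the inverse matrix as
described in Definition 5.

Definition 5 Given a block matrix M we denote the inverse as:

\[
M = \begin{bmatrix}
M_{1,1} & M_{1,2} & \cdots & M_{1,n} \\
M_{2,1} & M_{2,2} & \cdots & M_{2,n} \\
\vdots & \vdots & \ddots & \vdots \\
M_{n,1} & M_{n,2} & \cdots & M_{n,n}
\end{bmatrix}
\;\Rightarrow\;
M^{-1} = \begin{bmatrix}
M^{inv}_{1,1} & M^{inv}_{1,2} & \cdots & M^{inv}_{1,n} \\
M^{inv}_{2,1} & M^{inv}_{2,2} & \cdots & M^{inv}_{2,n} \\
\vdots & \vdots & \ddots & \vdots \\
M^{inv}_{n,1} & M^{inv}_{n,2} & \cdots & M^{inv}_{n,n}
\end{bmatrix}. \tag{20}
\]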


We can now give a matrix proof of Theorem 2 as well. 


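Proof (Proof of Theorem 2) We let B be the part of the matrix $(I - cA^\top)$ corresponding
to the nodes in the line, which gives:

\[
(I - cA^\top) = \begin{bmatrix} B & C \\ D & E \end{bmatrix}. \tag{21}
\]

We write

\[
(I - cA^\top)^{-1} = \begin{bmatrix} B^{inv} & C^{inv} \\ D^{inv} & E^{inv} \end{bmatrix}. \tag{22}
\]

Here $E = [1]$ corresponds to the outside node and $D = 0$, since no node in the line
links to the outside node. Using Lemma 1 for blockwise inverse we get $B^{inv} = (B - CE^{-1}D)^{-1} = B^{-1}$.
Since B is the matrix for the simple line found earlier we get:

\[
B^{inv} = B^{-1} = \begin{bmatrix}
1 & c & c^2 & \cdots & c^{\,n_L - 1} \\
0 & 1 & c & \cdots & c^{\,n_L - 2} \\
0 & 0 & 1 & \cdots & c^{\,n_L - 3} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{bmatrix}, \tag{23}
\]

where $n_L$ is the total number of nodes in the line. $C = [0 \;\ldots\; {-c} \;\; 0 \;\ldots\; 0]^\top$, where the
non-zero element −c is at position j, gives:

\[
C^{inv} = -B^{-1}CE^{-1} = -B^{-1}C = [\,c^{j} \;\; c^{j-1} \;\ldots\; c \;\; 0 \;\ldots\; 0\,]^\top. \tag{24}
\]

Last, since D = 0 we get $D^{inv} = 0$, $E^{inv} = 1$. Since the weight vector u is uniform
we get the PageRank of a node as the sum of the corresponding row in $(I - cA^\top)^{-1}$. For
the nodes in the line we get PageRank:

\[
R^{(2)}_i = \sum_{k=0}^{n_L - i} c^k + b_{ij} = \frac{1 - c^{\,n_L - i + 1}}{1 - c} + b_{ij}, \tag{25}
\]

\[
b_{ij} = \begin{cases} c^{\,j + 1 - i}, & j \ge i, \\ 0, & j < i, \end{cases} \tag{26}
\]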


where the sum is over the first $n_L$ values in the row and $b_{ij}$ is the value in the last column.
For the last row we obviously get sum 1.
We get the normalized PageRank $R^{(1)}$ by dividing: $R^{(1)} = R^{(2)} / \|R^{(2)}\|_1$.

6 Changes in a Complete Graph


Complete graphs or similar structures are common both as parts of a site and as a
way between different sites to try to gain a better rank. An image of a complete
graph with five nodes can be seen in Fig. 3. We recall that the system matrix for this
system is:

\[
A = \frac{1}{4} \begin{bmatrix}
0 & 1 & 1 & 1 & 1 \\
1 & 0 & 1 & 1 & 1 \\
1 & 1 & 0 & 1 & 1 \\
1 & 1 & 1 & 0 & 1 \\
1 & 1 & 1 & 1 & 0
\end{bmatrix}.
\]

Using this we get the inverse of this system as:

\[
(I - cA^\top)^{-1} = \frac{1}{c^2 + 3c - 4} \begin{bmatrix}
3c - 4 & -c & -c & -c & -c \\
-c & 3c - 4 & -c & -c & -c \\
-c & -c & 3c - 4 & -c & -c \\
-c & -c & -c & 3c - 4 & -c \\
-c & -c & -c & -c & 3c - 4
\end{bmatrix}.
\]

After normalization we will obviously end up with $R^{(1)}_i = 1/5$ as PageRank for
every node i. However, since there are no dangling nodes in the complete graph, all
the nodes will have maximum influence on the PageRank of the system. Additionally,
since they only point to each other, they will not share any of it with the outside in
the case of a bigger link matrix with a part of it being a complete graph. This makes
a complete graph similar to a dangling node in that it will not increase the rank of
anyone else, but with the addition of having a higher rank in itself since it can increase
its own rank to a certain extent.

Trying to find an expression for the elements in the inverse $(I - cA^\top)^{-1}$ for the
complete graph we formulate the following lemma:

Lemma 2 The diagonal element $a_d$ of the inverse $(I - cA^\top)^{-1}$ of the complete graph
with n nodes is:

\[
a_d = \frac{(n - 1) - c(n - 2)}{(n - 1) - c(n - 2) - c^2}. \tag{27}
\]

The non-diagonal elements $a_{ij}$ can be written as:

\[
a_{ij} = \frac{c}{(n - 1) - c(n - 2) - c^2}. \tag{28}
\]


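Proof The diagonal element is the sum of the probabilities of all paths to node $e_d$
from itself. This can be written as a geometric sum: $a_d = \sum_{k=0}^{\infty} \bigl( P(e_d \to e_d) \bigr)^k$, where
$P(e_d \to e_d)$ is the probability of getting from node $e_d$ back to node $e_d$. The probability
$P(e_d \to e_d)$ can be written as:

\[
\begin{aligned}
P(e_d \to e_d) &= \frac{c^2}{n - 1} + \frac{c^3 (n - 2)}{(n - 1)^2} + \frac{c^4 (n - 2)^2}{(n - 1)^3} + \cdots \\
&= \frac{c^2}{n - 1} \sum_{k=0}^{\infty} \left( \frac{c(n - 2)}{n - 1} \right)^k = \frac{c^2}{(n - 1) - c(n - 2)}.
\end{aligned} \tag{29}
\]

This gives:

\[
a_d = \sum_{k=0}^{\infty} \left( \frac{c^2}{(n - 1) - c(n - 2)} \right)^k = \frac{(n - 1) - c(n - 2)}{(n - 1) - c(n - 2) - c^2}. \tag{30}
\]

For non-diagonal elements $a_{ij}$ we get $a_{ij} = P(e_j \to e_i)\, a_d$, where $P(e_j \to e_i)$ is
the probability of getting from node $e_j$ to node $e_i$ where $e_i \neq e_j$. This probability
can be written as:

\[
\begin{aligned}
P(e_j \to e_i) &= \frac{c}{n - 1} + \frac{c^2 (n - 2)}{(n - 1)^2} + \frac{c^3 (n - 2)^2}{(n - 1)^3} + \cdots \\
&= \frac{c}{n - 1} \sum_{k=0}^{\infty} \left( \frac{c(n - 2)}{n - 1} \right)^k = \frac{c}{(n - 1) - c(n - 2)}.
\end{aligned} \tag{31}
\]

This gives:

\[
a_{ij} = \frac{c}{(n - 1) - c(n - 2)} \cdot \frac{(n - 1) - c(n - 2)}{(n - 1) - c(n - 2) - c^2} = \frac{c}{(n - 1) - c(n - 2) - c^2}. \tag{32}
\]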


We give a matrix proof of Lemma 2 as well: 


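Proof (Proof of Lemma 2) We consider a general matrix A of the form:

\[
A = \begin{bmatrix}
1 & a & a & \cdots & a \\
a & 1 & a & \cdots & a \\
a & a & 1 & \ddots & \vdots \\
\vdots & \vdots & \ddots & \ddots & a \\
a & a & \cdots & a & 1
\end{bmatrix}.
\]

We use Gauss-Jordan elimination to find the inverse $A^{-1}$:

\[
\left[ \begin{array}{ccccc|ccccc}
1 & a & a & \cdots & a & 1 & 0 & 0 & \cdots & 0 \\
a & 1 & a & \cdots & a & 0 & 1 & 0 & \cdots & 0 \\
a & a & 1 & \ddots & a & 0 & 0 & 1 & \ddots & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots & \vdots & \vdots & \ddots & \ddots & \vdots \\
a & a & a & \cdots & 1 & 0 & 0 & 0 & \cdots & 1
\end{array} \right].
\]

We add $-a r_1$, where $r_1$ is the first row, to every other row to eliminate the elements
below the 1 in the first column:

\[
\left[ \begin{array}{ccccc|ccccc}
1 & a & a & \cdots & a & 1 & 0 & 0 & \cdots & 0 \\
0 & 1 - a^2 & a - a^2 & \cdots & a - a^2 & -a & 1 & 0 & \cdots & 0 \\
0 & a - a^2 & 1 - a^2 & \ddots & a - a^2 & -a & 0 & 1 & \ddots & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots & \vdots & \vdots & \ddots & \ddots & \vdots \\
0 & a - a^2 & a - a^2 & \cdots & 1 - a^2 & -a & 0 & 0 & \cdots & 1
\end{array} \right].
\]

Next we eliminate the values to the right of the 1 on the first row. We add
$k \sum_{i=2}^{n} r_i$, where $r_i$ is row i, to the first row, giving the equation:

\[
a = -k \bigl( 1 - a^2 + (n - 2)(a - a^2) \bigr) \tag{33}
\]

\[
\Rightarrow k = \frac{-a}{1 - a^2 + (n - 2)(a - a^2)}.
\]

This gives:

\[
\left[ \begin{array}{ccccc|ccccc}
1 & 0 & 0 & \cdots & 0 & 1 - (n - 1)ak & k & k & \cdots & k \\
0 & 1 - a^2 & a - a^2 & \cdots & a - a^2 & -a & 1 & 0 & \cdots & 0 \\
0 & a - a^2 & 1 - a^2 & \ddots & a - a^2 & -a & 0 & 1 & \ddots & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots & \vdots & \vdots & \ddots & \ddots & \vdots \\
0 & a - a^2 & a - a^2 & \cdots & 1 - a^2 & -a & 0 & 0 & \cdots & 1
\end{array} \right].
\]

We are now done calculating the first row of the inverse $A^{-1}$. We get the other rows
using the same calculations if we start with another pivot element. For the inverse
matrix we get diagonal elements $d = 1 - (n - 1)ak$ and for all other elements $e = k$,
where n is the total number of rows, giving an inverse like below:

\[
A^{-1} = \begin{bmatrix}
1 - (n - 1)ak & k & k & \cdots & k \\
k & 1 - (n - 1)ak & k & \cdots & k \\
k & k & 1 - (n - 1)ak & \ddots & \vdots \\
\vdots & \vdots & \ddots & \ddots & k \\
k & k & \cdots & k & 1 - (n - 1)ak
\end{bmatrix}.
\]

Calculating for $a = -c/(n - 1)$ as for a complete graph gives:

\[
k = \frac{-a}{1 - a^2 + (n - 2)(a - a^2)} = \frac{c}{(n - 1) - (n - 2)c - c^2}, \tag{34}
\]

\[
d = 1 - (n - 1)ak = 1 + \frac{c^2}{(n - 1) - (n - 2)c - c^2} = \frac{(n - 1) - (n - 2)c}{(n - 1) - (n - 2)c - c^2}. \tag{35}
\]

And the proof is complete.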


Using this we immediately get the PageRank (before normalization) of elements
in a complete graph with uniform u:

Theorem 3 Given a complete graph with n > 1 nodes, PageRank $R^{(2)}$ before
normalization can be written as:

\[
R^{(2)}_i = \frac{1}{1 - c}. \tag{36}
\]

Proof From Lemma 2 we already have the inverse $(I - cA^\top)^{-1}$. We then find the
PageRank by summation of any row of the matrix (since all rows have equal sum).

\[
\begin{aligned}
R^{(2)}_i &= a_d + (n - 1) a_{ij}, \quad i \neq j, \\
&= \frac{(n - 1) - c(n - 2) + c(n - 1)}{(n - 1) - c(n - 2) - c^2}
 = \frac{c + (n - 1)}{(n - 1) - c(n - 2) - c^2} = \frac{1}{1 - c}.
\end{aligned} \tag{37}
\]
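Lemma 2 and Theorem 3 can be verified without inverting anything numerically, because both $(I - cA^\top)$ and the claimed inverse have one value on the diagonal and one value everywhere else. The sketch below (function name and c = 0.85 are illustrative assumptions) multiplies the two symbolically-structured matrices entrywise.

```python
from fractions import Fraction

def complete_graph_check(n, c):
    """Return (diagonal of product, off-diagonal of product, row sum)."""
    den = (n - 1) - c * (n - 2) - c**2
    a_d = ((n - 1) - c * (n - 2)) / den  # Lemma 2, diagonal element (27)
    a_ij = c / den                       # Lemma 2, non-diagonal element (28)
    off = -c / (n - 1)                   # off-diagonal entry of I - c A^T
    # Entries of (I - c A^T) times the claimed inverse.
    prod_diag = a_d + (n - 1) * off * a_ij
    prod_off = a_ij + off * a_d + (n - 2) * off * a_ij
    rank = a_d + (n - 1) * a_ij          # row sum used in Theorem 3
    return prod_diag, prod_off, rank

c = Fraction(85, 100)
results = {n: complete_graph_check(n, c) for n in (2, 5, 10)}
```

For every n the product is the identity and the row sum equals 1/(1 − c), independent of the size of the graph.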


We note that since we have no dangling nodes, all the probability from a node in
the complete graph is distributed within the complete graph. Also, the size of the graph
is irrelevant for the individual nodes as long as none are linked to from outside sources
and it consists of at least two nodes. In the $R^{(1)}$ sense the size obviously changes
the result, since we would increase the overall number of nodes in the system by
increasing the size of the complete graph. Two things are important to note however:
the higher one's own PageRank before joining the complete graph (probability of
getting there from outside nodes), the more gain there is by joining a small complete
graph in order to maximize the probability of returning to itself. In the same way, if a
node has a very low rank it gains much by joining a large complete graph of nodes
with higher rank than itself.

6.1 Adding a Link Out of a Complete Graph


If we want to see how the complete graph changes when adding one link from one
node (node one) out of the complete graph, we end up with the following system
matrix for the nodes in the complete graph:

\[
(I - cA^\top) = \begin{bmatrix}
1 & -c/4 & -c/4 & -c/4 & -c/4 \\
-c/5 & 1 & -c/4 & -c/4 & -c/4 \\
-c/5 & -c/4 & 1 & -c/4 & -c/4 \\
-c/5 & -c/4 & -c/4 & 1 & -c/4 \\
-c/5 & -c/4 & -c/4 & -c/4 & 1
\end{bmatrix}.
\]

After taking the inverse we get:

\[
(I - cA^\top)^{-1} = \frac{1}{s} \begin{bmatrix}
15c - 20 & -5c & -5c & -5c & -5c \\
-4c & p & q & q & q \\
-4c & q & p & q & q \\
-4c & q & q & p & q \\
-4c & q & q & q & p
\end{bmatrix},
\qquad
p = \frac{12c^2 + 40c - 80}{c + 4}, \quad q = \frac{-4c(5 + c)}{c + 4},
\]

where $s = 4c^2 + 15c - 20$.
We find the expression for the PageRank in a complete graph with one node
linking out to be the following, assuming uniform u.


Theorem 4 For the nodes in a complete graph where the first node
links out of the complete graph, the PageRank can be written as:

\[
R^{(2)}_1 = \frac{n(n - 1) + nc}{n(n - 1) - (n - 1)c^2 - n(n - 2)c}, \tag{38}
\]

\[
R^{(2)}_i = \frac{(c + n)(n - 1)}{n(n - 1) - (n - 1)c^2 - n(n - 2)c}, \quad n \ge i > 1, \tag{39}
\]

where n is the number of nodes in the complete graph and node one links out of the
complete graph.


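Proof We start by looking at the PageRank as a probability, and we let $e_1$ be the node
linking out. The probability to get from $e_1$ back to itself is:

\[
\begin{aligned}
P(e_1 \to e_1) &= \frac{c(n - 1)}{n} \frac{c}{n - 1} + \frac{c(n - 1)}{n} \frac{c}{n - 1} \frac{c(n - 2)}{n - 1} + \cdots \\
&= \frac{c(n - 1)}{n} \frac{c}{n - 1} \sum_{k=0}^{\infty} \left( \frac{c(n - 2)}{n - 1} \right)^k
 = \frac{c^2}{n} \cdot \frac{n - 1}{(n - 1) - c(n - 2)}.
\end{aligned} \tag{40}
\]

And we get the sum of all probabilities from $e_1$ back to itself as:

\[
\sum_{k=0}^{\infty} \bigl( P(e_1 \to e_1) \bigr)^k = \frac{1}{1 - P(e_1 \to e_1)}
= \frac{n((n - 1) - c(n - 2))}{n((n - 1) - c(n - 2)) - c^2 (n - 1)} = B^{inv}. \tag{41}
\]

We remember that on the diagonal of $(I - cA^\top)^{-1}$ we have the sums of probabilities
of nodes going back to themselves. So we divide the matrix $(I - cA^\top)$ in blocks:

\[
(I - cA^\top) = \begin{bmatrix} B & C \\ D & E \end{bmatrix},
\]

and the inverse matrix:

\[
(I - cA^\top)^{-1} = \begin{bmatrix} B^{inv} & C^{inv} \\ D^{inv} & E^{inv} \end{bmatrix}.
\]

We note that $B^{inv}$ is not the inverse of B but the part of the inverse $(I - cA^\top)^{-1}$
corresponding to block B. We let B = [1] corresponding to the node linking out and
we get $B^{inv}$ as above.

For the elements $C^{inv}_i$, $i \neq 1$, of $C^{inv}$ we find them as

\[
\begin{aligned}
C^{inv}_i &= P(e_i \to e_1) \sum_{k=0}^{\infty} \bigl( P(e_1 \to e_1) \bigr)^k \\
&= \frac{c}{n - 1} \sum_{k=0}^{\infty} \left( \frac{c(n - 2)}{n - 1} \right)^k B^{inv}
 = \frac{cn}{n((n - 1) - c(n - 2)) - c^2 (n - 1)}.
\end{aligned} \tag{42}
\]

Since E and $DB^{-1}C$ are both symmetric and have every non-diagonal element
equal as well as all diagonal elements equal, the inverse $E^{inv} = (E - DB^{-1}C)^{-1}$
should be the same as well. In particular, every row and column has the same sum.
From Lemma 1 for blockwise inversion we get:

\[
C^{inv}_i = \frac{c}{n - 1} \sum_{k=1}^{n-1} E^{inv}_{ki}, \tag{43}
\]

\[
D^{inv}_i = \frac{c}{n} \sum_{k=1}^{n-1} E^{inv}_{ik} = \frac{c}{n} \sum_{k=1}^{n-1} E^{inv}_{ki}, \tag{44}
\]

\[
\Rightarrow D^{inv}_i = \frac{(n - 1) C^{inv}_i}{n}, \qquad \sum_{k=1}^{n-1} E^{inv}_{ik} = \frac{(n - 1) C^{inv}_i}{c}. \tag{45}
\]

We get the PageRank as:

\[
\begin{aligned}
R^{(2)}_1 &= B^{inv} + (n - 1) C^{inv}_i
= \frac{n((n - 1) - c(n - 2))}{n((n - 1) - c(n - 2)) - c^2 (n - 1)} + \frac{(n - 1)cn}{n((n - 1) - c(n - 2)) - c^2 (n - 1)} \\
&= \frac{n(n - 1) + nc}{n(n - 1) - (n - 1)c^2 - n(n - 2)c},
\end{aligned} \tag{46}
\]

\[
\begin{aligned}
R^{(2)}_i &= D^{inv}_i + \sum_{k=1}^{n-1} E^{inv}_{ik}
= \frac{(n - 1) C^{inv}_i}{n} + \frac{(n - 1) C^{inv}_i}{c}
= \frac{(n - 1) C^{inv}_i (c + n)}{nc} \\
&= \frac{(c + n)(n - 1)}{n(n - 1) - (n - 1)c^2 - n(n - 2)c}.
\end{aligned} \tag{47}
\]

And the proof is complete.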


We give a matrix proof of Theorem 4 as well: 


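Proof (Proof of Theorem 4) We consider the square matrix A with n rows:

\[
A = \begin{bmatrix}
1 & a & a & \cdots & a \\
b & 1 & a & \cdots & a \\
b & a & 1 & \ddots & \vdots \\
\vdots & \vdots & \ddots & \ddots & a \\
b & a & \cdots & a & 1
\end{bmatrix},
\]

where $a = -c/(n - 1)$, $b = -c/n$. We divide the matrix in blocks:

\[
A = \begin{bmatrix} B & C \\ D & E \end{bmatrix},
\]

where B = [1], $C = [a \; a \; \ldots \; a]$, $D = [b \; b \; \ldots \; b]^\top$ and E looks like the matrix for
a complete graph but of size $(n - 1) \times (n - 1)$:

\[
E = \begin{bmatrix}
1 & a & a & \cdots & a \\
a & 1 & a & \cdots & a \\
a & a & 1 & \ddots & \vdots \\
\vdots & \vdots & \ddots & \ddots & a \\
a & a & \cdots & a & 1
\end{bmatrix}.
\]

In the same way as in the proof of Lemma 2 we find the elements of $B^{inv}$, $C^{inv}$ by
choosing the top left element as pivot element. This gives

\[
k_a = \frac{-a}{(1 - ab) + (n - 2)(a - ab)}. \tag{48}
\]

We write $A^{-1}$ as blocks:

\[
A^{-1} = \begin{bmatrix} B^{inv} & C^{inv} \\ D^{inv} & E^{inv} \end{bmatrix},
\]

and get: $B^{inv} = 1 - (n - 1) b k_a$, and $C^{inv}_i = k_a$.
From the matrix proof of Lemma 2 we get the non-diagonal elements $E_e$ and
diagonal elements $E_d$ of $E^{-1}$ as

\[
E_e = \frac{-a}{(1 - a^2) + (n - 3)(a - a^2)} = \frac{(n - 1)c}{(n - 1)^2 - (n - 3)(n - 1)c - (n - 2)c^2}, \tag{49}
\]

\[
E_d = 1 - (n - 2) a E_e = \frac{(n - 1)^2 - (n - 3)(n - 1)c}{(n - 1)^2 - (n - 3)(n - 1)c - (n - 2)c^2}. \tag{50}
\]

From Lemma 1 we then get:

\[
B^{inv} = (B - CE^{-1}D)^{-1} = 1 - (n - 1) b k_a, \tag{51}
\]

\[
\begin{aligned}
C^{inv} &= -(B - CE^{-1}D)^{-1} C E^{-1} \\
\Rightarrow C^{inv}_i &= -(B - CE^{-1}D)^{-1} a (E_d + (n - 2) E_e) = k_a,
\end{aligned} \tag{52}
\]

\[
\begin{aligned}
D^{inv} &= -E^{-1} D (B - CE^{-1}D)^{-1} \\
\Rightarrow D^{inv}_i &= -b (E_d + (n - 2) E_e)(B - CE^{-1}D)^{-1} = \frac{b k_a}{a},
\end{aligned} \tag{53}
\]

\[
E^{inv} = E^{-1} + E^{-1} D (B - CE^{-1}D)^{-1} C E^{-1}, \tag{54}
\]

\[
\begin{aligned}
\Rightarrow E^{inv}_d &= E_d + b (E_d + (n - 2) E_e)(B - CE^{-1}D)^{-1} a (E_d + (n - 2) E_e) \\
&= E_d - b (E_d + (n - 2) E_e) k_a, \\
E^{inv}_e &= E_e + b (E_d + (n - 2) E_e)(B - CE^{-1}D)^{-1} a (E_d + (n - 2) E_e) \\
&= E_e - b (E_d + (n - 2) E_e) k_a.
\end{aligned} \tag{55}
\]

We replace $a = -c/(n - 1)$ and $b = -c/n$ as for our complete graph and get the
inverse:

\[
(I - cA^\top)^{-1} = \begin{bmatrix}
1 - (n - 1) b k_a & k_a & k_a & \cdots & k_a \\
b k_a / a & E^{inv}_d & E^{inv}_e & \cdots & E^{inv}_e \\
b k_a / a & E^{inv}_e & E^{inv}_d & \ddots & \vdots \\
\vdots & \vdots & \ddots & \ddots & E^{inv}_e \\
b k_a / a & E^{inv}_e & \cdots & E^{inv}_e & E^{inv}_d
\end{bmatrix}.
\]

For the PageRank of the node linking out we get:

\[
\begin{aligned}
R^{(2)}_1 &= B^{inv} + (n - 1) C^{inv}_i \\
&= 1 - (n - 1) b k_a + (n - 1) k_a = 1 - (n - 1)(b - 1) k_a \\
&= \frac{(1 - ab) + (n - 2)(a - ab) + (n - 1)(b - 1)a}{(1 - ab) + (n - 2)(a - ab)} \\
&= \frac{(1 - ab) - (a - ab)}{(1 - ab) + (n - 2)(a - ab)}
 = \frac{n(n - 1) + cn}{n(n - 1) - n(n - 2)c - (n - 1)c^2}.
\end{aligned} \tag{56}
\]

For all other nodes we get PageRank:

\[
\begin{aligned}
R^{(2)}_i &= D^{inv}_i + E^{inv}_d + (n - 2) E^{inv}_e \\
&= \frac{b k_a}{a} + E_d - b (E_d + (n - 2) E_e) k_a + (n - 2) E_e - (n - 2) b (E_d + (n - 2) E_e) k_a \\
&= E_d + (n - 2) E_e - (n - 1) b (E_d + (n - 2) E_e) k_a + (b/a) k_a \\
&= \frac{1 - a}{(1 - a^2) + (n - 3)(a - a^2)} - \frac{b}{(1 - ab) + (n - 2)(a - ab)} \\
&\quad + \frac{(n - 1) a b (1 - a)}{\bigl( (1 - a^2) + (n - 3)(a - a^2) \bigr) \bigl( (1 - ab) + (n - 2)(a - ab) \bigr)} \\
&= \frac{1 - b}{1 - ab + (n - 2)(a - ab)} = \frac{(n - 1)(n + c)}{n(n - 1) - n(n - 2)c - (n - 1)c^2}.
\end{aligned} \tag{57}
\]

And the proof is complete.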
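The closed forms (38) and (39) can be compared with a direct solution of the linear system. The following is a sketch under stated assumptions: the helper names `solve`, `linked_out_system` and `theorem4` are introduced here, u is uniform, and c = 0.85 is an illustrative value.

```python
from fractions import Fraction

def solve(m, rhs):
    """Solve m x = rhs by Gauss-Jordan elimination in exact arithmetic."""
    n = len(m)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def linked_out_system(n, c):
    """I - c A^T restricted to the complete graph; node 1 also links outside."""
    a = -c / (n - 1)  # nodes 2..n split their weight over n - 1 links
    b = -c / n        # node 1 splits its weight over n links
    return [[Fraction(1) if i == j else (b if j == 0 else a) for j in range(n)]
            for i in range(n)]

def theorem4(n, c):
    den = n * (n - 1) - (n - 1) * c**2 - n * (n - 2) * c
    first = (n * (n - 1) + n * c) / den   # equation (38)
    other = (c + n) * (n - 1) / den       # equation (39)
    return [first] + [other] * (n - 1)

c = Fraction(85, 100)
solved = {n: solve(linked_out_system(n, c), [Fraction(1)] * n) for n in (2, 3, 5)}
closed = {n: theorem4(n, c) for n in (2, 3, 5)}
```

For each tested n the solved ranks coincide exactly with the closed-form expressions.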


Just looking at the expression it is hard to say how the PageRank changes after
linking out. We can however note a couple of things. First of all, the PageRank is
lower than for the complete graph (since we now have a chance to escape the graph).
More interestingly, when comparing the node that links out with the others we
formulate the following theorem:

Theorem 5 In a complete graph not linked to from the outside but with one link out,
the node that links out will have the highest PageRank in the complete graph.

Proof Using the expression for PageRank in a complete graph with one link out we
want to prove $R^{(2)}_1 > R^{(2)}_i$, where $R^{(2)}_1$ is the PageRank for the node linking out and
$R^{(2)}_i$ is the PageRank of all the other nodes.

\[
\begin{aligned}
& R^{(2)}_1 > R^{(2)}_i \\
\Leftrightarrow\; & \frac{n(n - 1) + nc}{n(n - 1) - (n - 1)c^2 - n(n - 2)c} > \frac{(c + n)(n - 1)}{n(n - 1) - (n - 1)c^2 - n(n - 2)c} \\
\Leftrightarrow\; & n(n - 1) + nc > (c + n)(n - 1) \\
\Leftrightarrow\; & n^2 + nc - n > n^2 + nc - n - c,
\end{aligned} \tag{58}
\]

where 0 < c < 1 and n > 1 is the number of nodes in the complete graph. The common
denominator is positive for these values (it is decreasing in c and equals 1 at c = 1),
so multiplying by it preserves the inequality. The last inequality is
obviously true and the proof is complete.


Apart from the knowledge that it is the node that links out of a complete graph
that loses the least from it, we can also see that as the number of nodes in the complete
graph increases the difference between them decreases, since we have a factor $n^2$ in
the denominator compared to only a difference of c in the numerator.
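As a worked illustration of this remark, subtracting (39) from (38) gives the gap between the two ranks explicitly. The numerical values n = 5 and c = 0.85 below are only an example choice.

```latex
\[
R^{(2)}_1 - R^{(2)}_i
  = \frac{n(n-1) + nc - (c+n)(n-1)}{n(n-1) - (n-1)c^2 - n(n-2)c}
  = \frac{c}{n(n-1) - (n-1)c^2 - n(n-2)c}.
\]
% Example values n = 5, c = 0.85 (an arbitrary illustration):
\[
\frac{0.85}{20 - 4 \cdot 0.7225 - 15 \cdot 0.85} = \frac{0.85}{4.36} \approx 0.195 .
\]
```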


6.2 Effects of Linking to a Complete Graph 


In the case of a link to a complete graph without a link back from the complete
graph we can easily guess the result. From earlier we know that for a node linking to
one other node in a link matrix, with no chance of getting back to itself, the column
corresponding to the node linking out is c times the column of the node it links to.
Additionally we need to add a one to the diagonal element for that column.
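The column relation can be checked on a small instance. The sketch below assumes a hypothetical system with a three-node complete graph plus one extra node linking to it, and c = 0.85; columns of the inverse are obtained by solving against unit vectors with an illustrative Gauss-Jordan helper.

```python
from fractions import Fraction

def solve(m, rhs):
    """Solve m x = rhs by Gauss-Jordan elimination in exact arithmetic."""
    n = len(m)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

c = Fraction(85, 100)
n = 4
# Hypothetical example: nodes 0, 1, 2 form a complete graph, node 3 links to node 0.
links = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [0]}
system = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
for src, targets in links.items():
    for dst in targets:
        system[dst][src] -= c / len(targets)  # entry of -c A^T

def unit(j):
    return [Fraction(int(i == j)) for i in range(n)]

col_target = solve(system, unit(0))  # inverse column of the node linked to
col_new = solve(system, unit(3))     # inverse column of the node linking in
gain = sum(col_new[:3])              # extra rank received by the complete graph
```

The new column is c times the target column plus a one on the diagonal, and the total extra rank handed to the complete graph is c/(1 − c), with the largest share going to the node that is linked to.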

The fact that there is no probability (or a very low one, if the graph is only close to
complete) of escaping the complete graph and giving any advantage to other parts of the
system means that the complete graph as a whole gets maximum benefit from the links to it.
Looking at how the additional probability $c/(1 - c) = c + c^2 + c^3 + \cdots$ gets distributed
within the complete graph, we realize that the node linked to gains all of the initial $c$,
then loses a part $c^2$ distributed among all other nodes in the complete graph; after that
the rest is close to evenly distributed between all the nodes in the complete graph.
As such we see that the node linked to is the node which gains the most from the
link (which is what we would expect).

7 Conclusions


We have seen that we can solve the resulting equation system instead of using the
definition directly or using the Power method. While this method is significantly
slower, it has made it possible to gain a better understanding of the different roles of
the link matrix A and the weight vector u. We have seen how PageRank changes
when making some small changes in a couple of simple systems. For these systems
we also found explicit expressions for the PageRank, and in particular two ways to
find these: either by solving the equation system itself or by using a probabilistic
perspective and calculating:

\[
R^{(2)}_j = \left( \sum_{e_i \in S,\, e_i \neq e_j} P(e_i \to e_j) + 1 \right)
            \left( \sum_{k=0}^{\infty} \bigl( P(e_j \to e_j) \bigr)^k \right),
\]

where $P(e_i \to e_j)$ is the probability of hitting node $e_j$ in a random walk starting in node $e_i$ and
the weight vector u is uniform.

One of the main advantages of using non-normalized PageRank over the ordinary
normalized version is that it is possible to split a large system into multiple disjoint
systems and calculate $R^{(2)}$ for every subsystem separately, something which cannot
be done as easily with the normalized version.

PageRank, Connecting a Line of Nodes
with a Complete Graph


Christopher Engström and Sergei Silvestrov


Abstract The focus of this article is the PageRank algorithm originally defined by
S. Brin and L. Page as the stationary distribution of a certain random walk on a graph
used to rank homepages on the Internet. We will attempt to get a better understanding
of how PageRank changes after you make some changes to the graph such as adding
or removing edges between otherwise disjoint subgraphs. In particular we will take a
look at link structures consisting of a line of nodes or a complete graph where every
node links to all others and different ways to combine the two. Both the ordinary
normalized version of PageRank as well as a non-normalized version of PageRank
found by solving the corresponding linear system will be considered. We will see that it
is possible to find explicit formulas for the PageRank in some simple link structures
and, using these formulas, take a more in-depth look at the behavior of the ranking as
the system changes.


Keywords PageRank · Graph · Random walk · Block matrix


1 Introduction 


PageRank was initially used to rank homepages (nodes) based on the structure of
links between these pages. This is important in order to return the most relevant
results in, for example, search engines. Since the number of pages on the Internet is
huge and ever increasing, it is important that the method is extremely fast, but there
is also a heavy requirement on the quality of the ranking since very few people
will look through more than a couple of the highest ranked pages when looking for
something [4].

Some other applications of PageRank or similar methods include the EigenTrust
algorithm used for reputation management in P2P networks [12] and DebtRank for
evaluating risk in financial networks [1].

PageRank is usually calculated using the Power method, which can
be implemented very efficiently, even for very large systems. The convergence speed
of the Power method and its dependence on certain parameters have been studied
to some extent. For example, the Power method on a graph structure such as that
created by the Web will converge with a convergence rate of c, where c is one of
the parameters used in the definition [9], and the problem is well conditioned unless
c is very close to 1 [11]. Many methods have been developed in order to speed up
the calculation of PageRank, such as aggregating webpages that are “close” and
are expected to have a similar PageRank as in [10], or partitioning the graph into
different components as in [6].

There is also work done on the large scale using PageRank and other measures
in order to learn more about the Web, for example looking at the distribution of
PageRank both theoretically and experimentally such as in [5].

While the theory behind PageRank is well understood from Perron-Frobenius
theory for non-negative irreducible matrices [2, 8, 13] and the study of Markov
chains [14, 15], how PageRank is affected by changes in the system or parameters
is not as well known. We will start by giving some necessary definitions as well as
describing the notation used throughout the article. Before continuing to the main
part of the article we will also give some previous results described in [7] which will
be needed throughout the rest of the article. As in said previous work, we will consider
PageRank as the solution to a linear system of equations as well as probabilities of a
random walk through the graph and see what we can learn from this. Both ordinary
normalized PageRank as well as non-normalized PageRank will be considered and
we will highlight some differences between the two as the parameter c or the size of the
graph changes. In this article we will look at how PageRank changes as we combine
a line of vertices with edges in one direction with a complete graph in different ways
in Sect. 3. After that, in Sects. 4 and 5, we will take a closer look at the formulas
found for some of the examples, mainly by looking at partial derivatives of the
PageRank. We will see one of the possible reasons why c is usually chosen to be
around c ≈ 0.85: for larger c, PageRank for some nodes increases extremely fast while
for some other nodes it decreases extremely fast, while for lower c the difference
in PageRank between nodes becomes smaller the lower c gets and the initial weight vector
has a much larger influence on the final ranking.


2 Notation, Definitions and Previous Results 


We will start by describing the notation used throughout the article as well as
describing some common link structures which will be used. At the end of this section we
will give a couple of lemmas and theorems without proofs summarizing previous
results.
First some overall notation:

• $S_G$: The system of nodes and links for which we want to calculate PageRank;
it contains the system matrix $A_G$ as well as a weight vector $v_G$. Subindex G can be
either a capital letter or a number in the case of multiple systems.

• $n_G$: The number of nodes in system $S_G$.

• $A_G$: System matrix where a zero element $a_{ij}$ means there is no link from node i to
node j. Non-zero elements are equal to $1/n_i$ where $n_i$ is the number of links from
node i. Size $n_G \times n_G$.

• $v_G$: Non-negative weight vector, not necessarily with sum one. Size $n_G \times 1$.

• $u_G$: The weight vector $v_G$ normalized such that $\|u_G\|_1 = 1$. We note that $u_G$ is
proportional to $v_G$ ($u_G \propto v_G$). Size $n_G \times 1$.

• c: Parameter 0 < c < 1 for calculating PageRank, usually c = 0.85.

• $g_G$: Vector with elements equal to one for dangling nodes and zero for all other nodes in
$S_G$. Size $n_G \times 1$.

• $M_G$: Modified system matrix, $M_G = c(A_G + g_G u_G^\top)^\top + (1 - c) u_G e^\top$, used to
calculate PageRank, where e is the unit vector (the vector with all elements equal to one).
Size $n_G \times n_G$.

• S: Global system made up of multiple disjoint subsystems $S = S_1 \cup S_2 \cup \cdots \cup S_N$,
where N is the number of subsystems.

• V: Global weight vector for system S, $V = [v_1^\top \; v_2^\top \; \ldots \; v_N^\top]^\top$, where N is the
number of subsystems.


In the cases where there is only one possible system the subindex G will often be
omitted. For the systems making up S we define disjoint systems in the following
way.


Definition 1 Two systems S$), Sz are disjoint if there are no paths from any nodes in 
S; to S or from any nodes in S$, to S}. 


Different versions of PageRank will be denoted as follows:

\[ R^{(t)}_G[S_H \to S_I,\ S_J \to S_K,\ \dots], \]

where $t$ is the type of PageRank used and $S_G \subseteq S$ is the set of nodes in the global system $S$ for which $R$ is the PageRank. Often $S_G = S$, and we then write it as $R^{(t)}_S$. In the last part, within brackets, we write possible connections between otherwise disjoint subsystems in $S$; for example, an arrow to the right means there are links from the left system to the right system. How many and what type of links, however, needs to be specified for every individual case.

We will sometimes give the formula for a specific node $j$; in this case it will be noted as $R^{(t)}_{G,j}[S_H \to S_I,\ S_J \to S_K,\ \dots]$. When it is obvious which system to use (for example when only one is specified) and there are no connections between systems, $S_G$ as well as the brackets with connections between systems will usually be omitted, resulting in $R^{(t)}_j$. It should be obvious when this is the case. When normalizing the resulting elements such that their sum is equal to one we get the traditional normalized PageRank.
Definition 2 $R^{(1)}_G$ for system $S_G$ is defined as the eigenvector with eigenvalue one to the matrix $M_G = c(A_G + g_G u_G^\top)^\top + (1 - c)u_G e^\top$.


Note that $M$ is a stochastic matrix and therefore PageRank itself can be seen as the stationary distribution of a Markov chain describing a random walk on a graph described by $A$ (with some correction for vertices with no outgoing edges) and a small random chance $(1 - c)$ to move to a random node depending on the distribution described by $u$. By convention PageRank is normalized such that $\|R^{(1)}\|_1 = 1$ to get said stationary distribution, and it is usually reasonable to assume $M$ to be irreducible and primitive; hence $R^{(1)}$ will be a positive vector, which is easily shown using Perron-Frobenius theory for non-negative irreducible matrices [2, 8, 13]. The property $\|R^{(1)}\|_1 = 1$ generally does not hold for our other version of PageRank. If we instead set up the resulting equation system and solve it we get the second definition; the result is multiplied with $n_G$ in order to get multiplication with the one vector in the case of uniform $u_G$.


Definition 3 $R^{(2)}_G$ for system $S_G$ is defined as $R^{(2)}_G = (I - cA_G^\top)^{-1} n_G u_G$.


We note that generally $\|R^{(2)}\|_1 \neq 1$ as well as $R^{(2)}_G \neq n_G R^{(1)}_G$ unless there are no dangling nodes in the system. However the two versions of PageRank are proportional to each other ($R^{(2)}_G \propto R^{(1)}_G$).
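
The two definitions can be compared numerically. The following is a minimal sketch, assuming NumPy and a small hypothetical four-node graph with one dangling node; the function names `pagerank_r1` and `pagerank_r2` are ours and not from the original text. It computes $R^{(1)}$ by power iteration on $M$ and $R^{(2)}$ by solving the linear system, and then compares the two after rescaling.

```python
import numpy as np

def pagerank_r2(A, v, c=0.85):
    """Definition 3: solve (I - c A^T) R = n u with u = v / ||v||_1."""
    n = A.shape[0]
    u = v / v.sum()
    return np.linalg.solve(np.eye(n) - c * A.T, n * u)

def pagerank_r1(A, v, c=0.85, iterations=500):
    """Definition 2: eigenvector of M for eigenvalue one, via power iteration."""
    n = A.shape[0]
    u = v / v.sum()
    g = (A.sum(axis=1) == 0).astype(float)   # one for dangling nodes, zero otherwise
    M = c * (A + np.outer(g, u)).T + (1 - c) * np.outer(u, np.ones(n))
    r = np.full(n, 1.0 / n)
    for _ in range(iterations):
        r = M @ r
        r /= r.sum()                          # keep ||r||_1 = 1
    return r

# Hypothetical example graph: 0 -> {1, 2}, 1 -> 2, 2 -> {0, 3}, node 3 is dangling.
A = np.array([[0.0, 0.5, 0.5, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.5, 0.0, 0.0, 0.5],
              [0.0, 0.0, 0.0, 0.0]])
v = np.ones(4)                                # uniform weight vector

r1 = pagerank_r1(A, v)
r2 = pagerank_r2(A, v)
print("R1            :", r1)
print("R2 normalized :", r2 / r2.sum())
print("sum of R2     :", r2.sum())
```

In this sketch the rescaled $R^{(2)}$ agrees with $R^{(1)}$, while the sum of $R^{(2)}$ itself is not one.
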
Another way to calculate PageRank using this second definition is from a prob- 
abilistic perspective. For a proof of the theorem we refer to our earlier work [7]. 


Theorem 1 Consider a random walk on a graph described by $cA$ as before. We walk to a new node with probability $c$ and stop with probability $1 - c$. PageRank $R^{(2)}$ of a node when using uniform $u$ can be written:

\[ R^{(2)}_j = \left(\sum_{e_i \in S,\ e_i \neq e_j} P(e_i \to e_j) + 1\right)\left(\sum_{k=0}^{\infty} \left(P(e_j \to e_j)\right)^k\right), \tag{1} \]

where $P(e_i \to e_j)$ is the probability to hit node $e_j$ in a random walk starting in node $e_i$ described as above. This can be seen as the expected number of visits to $e_j$ if we do multiple random walks, starting in every node once.
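
Theorem 1 can be checked numerically on a small graph. The sketch below is an illustration under our own assumptions (NumPy, a hypothetical four-node graph, helper names of our choosing): hitting probabilities are obtained by solving a small linear system in which the target node is treated as absorbing, and the result of Eq. (1) is compared with Definition 3.

```python
import numpy as np

def hitting_probabilities(A, j, c):
    """h[i]: probability that a walk from i (move with prob. c, stop with prob. 1 - c)
    reaches node j after at least one step; h[j] is the return probability."""
    n = A.shape[0]
    T = c * A                                   # sub-stochastic transition matrix
    others = [k for k in range(n) if k != j]
    Q = T[np.ix_(others, others)]               # moves that avoid j
    direct = T[others, j]                       # one-step moves into j
    h_others = np.linalg.solve(np.eye(n - 1) - Q, direct)
    h = np.zeros(n)
    h[others] = h_others
    h[j] = T[j, j] + T[j, others] @ h_others
    return h

def pagerank_theorem1(A, c=0.85):
    n = A.shape[0]
    R = np.empty(n)
    for j in range(n):
        h = hitting_probabilities(A, j, c)
        reach = h.sum() - h[j] + 1.0            # sum_{i != j} P(e_i -> e_j) + 1
        R[j] = reach / (1.0 - h[j])             # geometric series in P(e_j -> e_j)
    return R

# Hypothetical example graph with one dangling node (node 3).
A = np.array([[0.0, 0.5, 0.5, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.5, 0.0, 0.0, 0.5],
              [0.0, 0.0, 0.0, 0.0]])
c = 0.85
by_theorem = pagerank_theorem1(A, c)
by_definition = np.linalg.solve(np.eye(4) - c * A.T, np.ones(4))
print(by_theorem)
print(by_definition)
```
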


Two small graph-structures that will be used are the simple line and complete 
graph. 


Definition 4 A simple line is a graph with $n_L$ nodes where node $n_L$ links to node $n_L - 1$, which in turn links to node $n_L - 2$, all the way until node 2 links to node 1.


Definition 5 A complete graph is a group of nodes in which all nodes in the group link to all other nodes in the group.


The following well-known lemma for blockwise inversion is easily verified by multiplying the matrix with the inverse according to the lemma [3, 7].

Lemma 1

\[ \begin{bmatrix} B & C \\ D & E \end{bmatrix}^{-1} = \begin{bmatrix} (B - CE^{-1}D)^{-1} & -(B - CE^{-1}D)^{-1}CE^{-1} \\ -E^{-1}D(B - CE^{-1}D)^{-1} & E^{-1} + E^{-1}D(B - CE^{-1}D)^{-1}CE^{-1} \end{bmatrix}, \tag{2} \]

where $B$, $E$ are square and $E$, $(B - CE^{-1}D)$ are nonsingular.
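
A quick numerical check of Lemma 1 can be sketched as follows, assuming NumPy. The test matrices are arbitrary, strictly diagonally dominant random blocks chosen by us so that all required inverses exist; the function name `block_inverse` is hypothetical.

```python
import numpy as np

def block_inverse(B, C, D, E):
    """Inverse of [[B, C], [D, E]] assembled block by block as in Lemma 1."""
    E_inv = np.linalg.inv(E)
    S_inv = np.linalg.inv(B - C @ E_inv @ D)      # (B - C E^{-1} D)^{-1}
    top = np.hstack([S_inv, -S_inv @ C @ E_inv])
    bottom = np.hstack([-E_inv @ D @ S_inv,
                        E_inv + E_inv @ D @ S_inv @ C @ E_inv])
    return np.vstack([top, bottom])

rng = np.random.default_rng(0)
# Entries in [-1, 1] plus 10 on the diagonal: strictly diagonally dominant, hence invertible.
B = rng.uniform(-1, 1, size=(3, 3)) + 10 * np.eye(3)
C = rng.uniform(-1, 1, size=(3, 2))
D = rng.uniform(-1, 1, size=(2, 3))
E = rng.uniform(-1, 1, size=(2, 2)) + 10 * np.eye(2)

full = np.block([[B, C], [D, E]])
blockwise = block_inverse(B, C, D, E)
print(np.max(np.abs(blockwise - np.linalg.inv(full))))
```
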


Below follow two previous results regarding PageRank for the simple line and the complete graph by themselves. Because of size considerations we refer to [7] for proofs of both theorems.


Theorem 2 The PageRank of a node $e_i$ belonging to the line in a system containing a simple line with one outside node linking to one of the nodes in the line when using uniform weight vector $u$ can be written:

\[ R^{(2)}_i = \sum_{k=0}^{n_L - i} c^k + b_{ij} = \frac{1 - c^{n_L - i + 1}}{1 - c} + b_{ij}, \tag{3} \]

\[ b_{ij} = \begin{cases} c^{j+1-i}, & j \ge i, \\ 0, & j < i, \end{cases} \tag{4} \]

where $n_L$ is the number of nodes in the line and the new node links to node $j$. The new node has rank 1.
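
As an illustration of Theorem 2, the sketch below builds the line with one outside node and compares the closed form with a direct solve of $(I - cA^\top)R = e$. It assumes NumPy; the node indexing and the helper names are our own choices for the example.

```python
import numpy as np

def line_with_outside_node(n_L, j):
    """Line nodes 1..n_L sit at indices 0..n_L-1; the outside node has index n_L
    and links to line node j."""
    A = np.zeros((n_L + 1, n_L + 1))
    for k in range(1, n_L):
        A[k, k - 1] = 1.0        # line node k+1 links to line node k
    A[n_L, j - 1] = 1.0          # outside node links to line node j
    return A

def theorem2(n_L, j, c):
    i = np.arange(1, n_L + 1)
    base = (1 - c ** (n_L - i + 1)) / (1 - c)
    b = np.where(i <= j, c ** (j + 1.0 - i), 0.0)   # b_ij is zero for nodes above node j
    return base + b

n_L, j, c = 5, 3, 0.85
A = line_with_outside_node(n_L, j)
R = np.linalg.solve(np.eye(n_L + 1) - c * A.T, np.ones(n_L + 1))
print("solve  :", R[:n_L])
print("formula:", theorem2(n_L, j, c))
print("outside node:", R[n_L])
```
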


Theorem 3 The PageRank of the nodes in a complete graph with the first node linking out of the complete graph can be written as:

\[ R^{(2)}_1 = \frac{n(n-1) + nc}{n(n-1) - (n-1)c^2 - n(n-2)c}, \tag{5} \]

\[ R^{(2)}_i = \frac{(c+n)(n-1)}{n(n-1) - (n-1)c^2 - n(n-2)c}, \quad i \neq 1, \tag{6} \]

where $n$ is the number of nodes in the complete graph and node one links out of the complete graph.
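
Theorem 3 can be checked in the same way. In the sketch below (NumPy assumed) the outgoing link of node one points to one extra node without out-links; this target node is our modelling assumption, since the theorem only states that node one links out of the complete graph.

```python
import numpy as np

def complete_graph_with_exit(n):
    """Complete graph on indices 0..n-1; node 0 also links to an outside node n
    that has no out-links (an assumption made for this example)."""
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = 1.0
    np.fill_diagonal(A, 0.0)
    A[0, n] = 1.0
    out = A.sum(axis=1)
    A[out > 0] /= out[out > 0, None]     # each non-zero row gets entries 1 / n_i
    return A

n, c = 5, 0.85
A = complete_graph_with_exit(n)
R = np.linalg.solve(np.eye(n + 1) - c * A.T, np.ones(n + 1))

denominator = n * (n - 1) - (n - 1) * c**2 - n * (n - 2) * c
R_first = (n * (n - 1) + n * c) / denominator        # Eq. (5)
R_other = (c + n) * (n - 1) / denominator            # Eq. (6)
print(R[0], R_first)
print(R[1:n], R_other)
```
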


3 Changes in PageRank When Connecting the Simple Line 
with the Complete Graph 


Looking at some simple structures and how PageRank changes as we change them, the goal is to learn something about how and why the rank changes as it does. This is an attempt to answer questions such as: How do I connect my two sites, or link within my one site, in such a way that I won't get any undesired results? In all these examples we will assume uniform $u$ (which means we can multiply the inverse $(I - cA^\top)^{-1}$ with the one vector in order to get $R^{(2)}$).

Here we will look at what happens when we connect a complete graph with a simple line in various ways. This way we can get some information on what type of structure is most effective in getting a high PageRank and see how they interact with each other.


3.1 Connecting the Simple Line with a Link from a Node 
in the Complete Graph to a Node in the Line 


Looking at the system where we let one node in a complete graph link to one node in a simple line we get a system similar to the case where we added a single node to the line (a complete graph with one node). An example of what the system could look like can be seen in Fig. 1. We have the two systems $S_L$, $S_G$ as the original systems for the simple line and complete graph respectively. We want to find the new PageRank of these nodes after creating our new system $S$ by adding a link from the first node in the complete graph, $e_{G,1}$, to node $e_{L,j}$ in the simple line. When using $n_L = 5$, $n_G = 5$, $j = 3$ we get the system with $(I - cA^\top)$ seen in Fig. 1.

Assuming uniform $u$, the PageRank in the simple line after adding the link from the complete graph, $R^{(2)}_L[S_G \to S_L]$, can still be written in about the same way:


Fig. 1 Simple line with one link from a complete graph to one node in the line


Theorem 4 Observing the nodes in a system $S$ made up of two systems, a simple line $S_L$ with $n_L$ nodes and a complete graph $S_G$ with $n_G$ nodes, where we add one link from node $e_g$ in the complete graph to node $e_j$ in the simple line. Assuming uniform $u$ we get the PageRank $R^{(2)}_{L,i}[S_G \to S_L]$ for the nodes in the line after the new link and $R^{(2)}_{G,i}[S_G \to S_L]$ for the nodes in the complete graph after the new link as:

\[ R^{(2)}_{L,i}[S_G \to S_L] = \sum_{k=0}^{n_L - i} c^k + b_{ij} = \frac{1 - c^{n_L - i + 1}}{1 - c} + b_{ij}, \tag{7} \]

\[ b_{ij} = \frac{-c^{j+1-i}\,(c + n_G - 1)}{(n_G - 1)c^2 + n_G(n_G - 2)c - n_G(n_G - 1)}, \quad j \ge i, \]

\[ b_{ij} = 0, \quad j < i. \]

For the nodes in the complete graph we get:

\[ R^{(2)}_{G,g}[S_G \to S_L] = \frac{n_G(n_G - 1) + n_G c}{n_G(n_G - 1) - (n_G - 1)c^2 - n_G(n_G - 2)c}, \]

\[ R^{(2)}_{G,i}[S_G \to S_L] = \frac{(c + n_G)(n_G - 1)}{n_G(n_G - 1) - (n_G - 1)c^2 - n_G(n_G - 2)c}, \]

where $R^{(2)}_{G,g}[S_G \to S_L]$ is the PageRank for the node in the complete graph linking to the line and $R^{(2)}_{G,i}[S_G \to S_L]$ is the PageRank of the other nodes in the complete graph.


Proof For the nodes in the complete graph we get the PageRank immediately from Theorem 3.

For the nodes in the line we get a similar result as when adding a link from a single node to the line in Theorem 2. We get the same PageRank for the nodes we cannot reach from the complete graph ($b_{ij} = 0$, $j < i$). For the nodes we can reach we need to modify $b_{ij}$. The sum of all probability to reach the node in the complete graph linking to the line is found in (5) in Theorem 3:

\[ R^{(2)}_{G,g}[S_G \to S_L] = -\frac{n_G(n_G - 1) + n_G c}{(n_G - 1)c^2 + n_G(n_G - 2)c - n_G(n_G - 1)}. \]

The probability to reach the linked-to node in the line, $e_j$, is then

\[ \left(\frac{c}{n_G}\right) R^{(2)}_{G,g}[S_G \to S_L], \]

and for any further node in the line we need to multiply with $c$ for every extra step. This gives:

\[ b_{ij} = -c^{j-i}\,\frac{c}{n_G}\,\frac{n_G(n_G - 1) + n_G c}{(n_G - 1)c^2 + n_G(n_G - 2)c - n_G(n_G - 1)} = -c^{j+1-i}\,\frac{c + (n_G - 1)}{(n_G - 1)c^2 + n_G(n_G - 2)c - n_G(n_G - 1)}, \tag{10} \]

and the proof is complete.
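
The statement of Theorem 4 can be verified numerically for the example $n_L = 5$, $n_G = 5$, $j = 3$. The sketch below assumes NumPy, uses our own node indexing, and writes the common denominator with the opposite sign (so that it is positive). The last check sums the theorem's expressions over all nodes; that closed form is our own rearrangement, given here only as a consistency check.

```python
import numpy as np

def normalize_rows(A):
    """Scale every non-zero row so that its entries are 1 / (number of links)."""
    A = A.copy()
    out = A.sum(axis=1)
    A[out > 0] /= out[out > 0, None]
    return A

def graph_to_line(n_L, n_G, j):
    """Line nodes 1..n_L at indices 0..n_L-1, complete-graph nodes at n_L..n_L+n_G-1.
    The first complete-graph node (index n_L) also links to line node j."""
    n = n_L + n_G
    A = np.zeros((n, n))
    for k in range(1, n_L):
        A[k, k - 1] = 1.0
    A[n_L:, n_L:] = 1.0
    np.fill_diagonal(A, 0.0)
    A[n_L, j - 1] = 1.0
    return normalize_rows(A)

n_L, n_G, j, c = 5, 5, 3, 0.85
A = graph_to_line(n_L, n_G, j)
R = np.linalg.solve(np.eye(n_L + n_G) - c * A.T, np.ones(n_L + n_G))

D = n_G * (n_G - 1) - (n_G - 1) * c**2 - n_G * (n_G - 2) * c   # positive form of the denominator
i = np.arange(1, n_L + 1)
b = np.where(i <= j, c ** (j + 1.0 - i) * (c + n_G - 1) / D, 0.0)
line = (1 - c ** (n_L - i + 1)) / (1 - c) + b
R_g = (n_G * (n_G - 1) + n_G * c) / D
R_other = (c + n_G) * (n_G - 1) / D

# Sum over all nodes (line part, b_ij part, complete-graph part).
N = ((n_L * (1 - c) - c * (1 - c**n_L)) / (1 - c) ** 2
     + c * (1 - c**j) * (c + n_G - 1) / ((1 - c) * D)
     + (n_G * (n_G - 1) + n_G * c + (n_G - 1) ** 2 * (c + n_G)) / D)
print(R[:n_L], line)
print(R.sum(), N)
```
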


Another way to prove the theorem is by setting up the linear system and using the lemma for blockwise inversion (Lemma 1). If we want to know the common normalized PageRank we find the normalizing constant as the sum of the PageRank of all the nodes:

\[ N = \frac{n_L(1 - c) - c(1 - c^{n_L})}{(1 - c)^2} - \frac{c\,(1 - c^{j})\,(c + n_G - 1)}{(1 - c)\left((n_G - 1)c^2 + n_G(n_G - 2)c - n_G(n_G - 1)\right)} - \frac{n_G(n_G - 1) + n_G c + (n_G - 1)^2(c + n_G)}{(n_G - 1)c^2 + n_G(n_G - 2)c - n_G(n_G - 1)}, \tag{11} \]

which can be used to get the normalized PageRank:

\[ R^{(1)}_i[S_G \to S_L] = R^{(2)}_i[S_G \to S_L]/N. \tag{12} \]


3.2 Connecting the Simple Line with a Complete Graph
by Adding a Link from a Node in the Line
to a Node in the Complete Graph


When we instead let one node $e_j$ in the simple line link to one node in the complete graph we get a system that could look like the system in Fig. 2.
For the PageRank we formulate the following:


Fig. 2 Simple line with one node in the line linking to a node in a complete graph


Theorem 5 Observing the nodes in a system $S$ made up of two systems, a simple line $S_L$ with $n_L$ nodes and a complete graph $S_G$ with $n_G$ nodes, where we add one link from node $e_j$ in the line to node $e_g$ in the complete graph. Assuming uniform $u$ we get the PageRank $R^{(2)}_{L,i}[S_L \to S_G]$ for the nodes in the line after the new link and $R^{(2)}_{G,i}[S_L \to S_G]$ for the nodes in the complete graph after the new link as:

\[ R^{(2)}_{L,i}[S_L \to S_G] = \frac{1 - c^{n_L + 1 - i}}{1 - c}, \quad i \ge j, \tag{13} \]

\[ R^{(2)}_{L,i}[S_L \to S_G] = \frac{1 - c^{j-i}}{1 - c} + \left(\frac{c^{j-i}}{2}\right)\frac{1 - c^{n_L + 1 - j}}{1 - c}, \quad i < j, \tag{14} \]

\[ R^{(2)}_{G,g}[S_L \to S_G] = \frac{1}{1 - c} + \left(\frac{c\,(1 - c^{n_L + 1 - j})}{2(1 - c)}\right)\frac{(n_G - 1) - c(n_G - 2)}{(n_G - 1) - c(n_G - 2) - c^2}, \tag{15} \]

\[ R^{(2)}_{G,i}[S_L \to S_G] = \frac{1}{1 - c} + \left(\frac{c\,(1 - c^{n_L + 1 - j})}{2(1 - c)}\right)\frac{c}{(n_G - 1) - c(n_G - 2) - c^2}, \quad i \neq g. \tag{16} \]


Proof We divide the matrix $(I - cA^\top)$ in blocks:

\[ (I - cA^\top) = \begin{bmatrix} B & C \\ D & E \end{bmatrix}, \]

where $B$ is the part corresponding to the line, $C$ is a zero matrix (since we have no links from nodes in the complete graph to the line), and $D$ is a zero matrix except for one element $D_{g,j} = -c/2$, where $e_j$ is the $j$:th node in the line, linking to the complete graph, and $e_g$ is the $g$:th node in the complete graph, linked to by node $e_j$. We note that $j$, $g$ are the internal numbers for the line and complete graph respectively and not their "number" in the combined graph. $E$ is the part corresponding to the complete graph.
In the same way we divide the inverse in blocks:

\[ (I - cA^\top)^{-1} = \begin{bmatrix} B^{inv} & C^{inv} \\ D^{inv} & E^{inv} \end{bmatrix}. \]

Using Lemma 1 for blockwise inversion we get:

\[ B^{inv} = (B - CE^{-1}D)^{-1} = B^{-1}, \tag{17} \]
\[ C^{inv} = -(B - CE^{-1}D)^{-1}CE^{-1} = 0, \]
\[ D^{inv} = -E^{-1}D(B - CE^{-1}D)^{-1} = -E^{-1}DB^{-1}, \]
\[ E^{inv} = E^{-1} + E^{-1}D(B - CE^{-1}D)^{-1}CE^{-1} = E^{-1}. \]


Since one of the nodes in the line links out we get $B$ divided in blocks:

\[ B = \begin{bmatrix} B_B & B_C \\ B_D & B_E \end{bmatrix}, \]

\[ B_B = \begin{bmatrix} 1 & -c & 0 & \dots & 0 \\ 0 & 1 & -c & \ddots & \vdots \\ \vdots & & \ddots & \ddots & 0 \\ 0 & \dots & 0 & 1 & -c \\ 0 & \dots & 0 & 0 & 1 \end{bmatrix}, \quad B_C = \begin{bmatrix} 0 & 0 & \dots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \dots & 0 \\ -c/2 & 0 & \dots & 0 \end{bmatrix}, \]

where $B_D$ is a zero matrix and $B_E$ looks the same as $B_B$ although possibly with a different size. The sizes of the blocks are: $B_B$: $(j - 1) \times (j - 1)$, $B_C$: $(j - 1) \times (n_L - j + 1)$, $B_D$: $(n_L - j + 1) \times (j - 1)$ and $B_E$: $(n_L - j + 1) \times (n_L - j + 1)$, where $n_L$ is the total number of nodes in the line.
For the blocks of the inverse we get:

\[ B_B^{inv} = B_B^{-1}, \tag{18} \]
\[ B_C^{inv} = -B_B^{-1}B_C B_E^{-1}, \]
\[ B_D^{inv} = 0, \]
\[ B_E^{inv} = B_E^{-1}. \]

$B_B^{inv}$ and $B_E^{inv}$ are found as the inverse for the simple line, leaving $B_C^{inv}$ to be computed. The only difference compared to a simple line is that the only non-zero element in $B_C$ is $-c/2$ rather than $-c$. In other words $B^{-1}$ is exactly as it would have been for a simple line, except for the block corresponding to $B_C^{inv}$, which is multiplied with 0.5.
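We can now find the PageRank of the nodes in the line. For $i < j$ (for $i \ge j$ the row is that of a simple line) we get:

\[ R^{(2)}_{L,i}[S_L \to S_G] = \sum_{k=1}^{n_L} B^{inv}_{ik} = \sum_{k=1}^{j-1} B^{inv}_{ik} + \sum_{k=j}^{n_L} B^{inv}_{ik} \tag{19} \]
\[ = \sum_{k=0}^{j-i-1} c^k + \sum_{k=j-i}^{n_L - i} \frac{c^k}{2} = \frac{1 - c^{j-i}}{1 - c} + \left(\frac{c^{j-i}}{2}\right)\frac{1 - c^{n_L - j + 1}}{1 - c}. \]

For the nodes in the complete graph we first need to find $D^{inv}$; to do so we start by calculating $DB^{-1}$. Since only one element $D_{g,j}$ of $D$ is non-zero, only row $g$ of $DB^{-1}$ can be non-zero. We get row $g$ as:

\[ (DB^{-1})_{g} = -\frac{c}{2}\begin{bmatrix} 0 & \dots & 0 & 1 & c & c^2 & \dots & c^{n_L - j} \end{bmatrix}, \]

where there are $j - 1$ zeros before the 1 ($B^{-1}$ is upper triangular). Multiplying this with $E^{-1}$ gives:

\[ -E^{-1}DB^{-1} = \frac{c}{2}\begin{bmatrix} 0 & \dots & 0 & s & cs & \dots & c^{n_L - j}s \\ \vdots & & \vdots & \vdots & \vdots & & \vdots \\ 0 & \dots & 0 & d & cd & \dots & c^{n_L - j}d \\ \vdots & & \vdots & \vdots & \vdots & & \vdots \\ 0 & \dots & 0 & s & cs & \dots & c^{n_L - j}s \end{bmatrix}, \]

where the row containing $d$ is row $g$ and

\[ s = \frac{c}{(n_G - 1) - c(n_G - 2) - c^2}, \quad d = \frac{(n_G - 1) - c(n_G - 2)}{(n_G - 1) - c(n_G - 2) - c^2}. \]

$E^{-1}$ was calculated in one of our previous works [7, Lemma 2, p. 14-17], where we considered the problem of a complete graph without any additional connections; $E^{-1}$ can be seen at the end of this proof.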

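We can now find the PageRank of the nodes in the complete graph by summation of the corresponding row:

\[ R^{(2)}_{G,i}[S_L \to S_G] = \sum_{k=1}^{n_L} D^{inv}_{ik} + \sum_{k=1}^{n_G} E^{inv}_{ik} = \sum_{k=1}^{n_L} D^{inv}_{ik} + \frac{1}{1 - c}. \tag{20} \]

We separate between the node in the complete graph linked to from the line and the other nodes in the complete graph.

\[ R^{(2)}_{G,i}[S_L \to S_G] = \frac{c}{2}\sum_{k=0}^{n_L - j} c^k s + \frac{1}{1 - c} = \left(\frac{c\,(1 - c^{n_L - j + 1})}{2(1 - c)}\right)\frac{c}{(n_G - 1) - c(n_G - 2) - c^2} + \frac{1}{1 - c}, \quad i \neq g, \tag{21} \]

\[ R^{(2)}_{G,g}[S_L \to S_G] = \frac{c}{2}\sum_{k=0}^{n_L - j} c^k d + \frac{1}{1 - c} = \left(\frac{c\,(1 - c^{n_L - j + 1})}{2(1 - c)}\right)\frac{(n_G - 1) - c(n_G - 2)}{(n_G - 1) - c(n_G - 2) - c^2} + \frac{1}{1 - c}. \tag{22} \]

And the proof is complete. For completeness we include the complete inverse as well:

\[ (I - cA^\top)^{-1} = \begin{bmatrix} B^{inv} & C^{inv} \\ D^{inv} & E^{inv} \end{bmatrix}, \quad B^{inv}_{ik} = \begin{cases} c^{k-i}, & i \le k \text{ and } (k < j \text{ or } i \ge j), \\ c^{k-i}/2, & i < j \le k, \\ 0, & i > k, \end{cases} \]

\[ C^{inv} = 0, \quad D^{inv} = -E^{-1}DB^{-1} \text{ (seen above)}, \]

\[ E^{inv} = E^{-1} = \begin{bmatrix} 1 - (n_G - 1)ak & k & \dots & k \\ k & 1 - (n_G - 1)ak & \ddots & \vdots \\ \vdots & \ddots & \ddots & k \\ k & \dots & k & 1 - (n_G - 1)ak \end{bmatrix}, \]

\[ a = -\frac{c}{n_G - 1}, \quad k = \frac{-a}{(1 - a^2) + (n_G - 2)(a - a^2)}. \]


Fig. 3 Simple line with one node in the line being a part of a complete graph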

This could also have been shown using a similar method as the one used to prove Theorem 4. The normalizing constant can then be found by summation of the individual PageRank of all the nodes in order to get the normalized PageRank $R^{(1)}$.

We note that while the node in the line that links to the complete graph does not lose anything from the new link, the nodes below it in the line do lose quite a lot because of it. Likewise the PageRank of the node that is linked to gains more from the new link than the others in the complete graph.
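
The expressions of Theorem 5 can be checked against a direct solve. The following sketch assumes NumPy, our own node indexing and $j \ge 2$ (so that node $e_j$ has two out-links); the helper names are hypothetical.

```python
import numpy as np

def normalize_rows(A):
    A = A.copy()
    out = A.sum(axis=1)
    A[out > 0] /= out[out > 0, None]
    return A

def line_to_graph(n_L, n_G, j, g):
    """Line nodes 1..n_L at indices 0..n_L-1, complete-graph nodes at n_L..n_L+n_G-1.
    Line node j (j >= 2) also links to the g:th node of the complete graph."""
    n = n_L + n_G
    A = np.zeros((n, n))
    for k in range(1, n_L):
        A[k, k - 1] = 1.0
    A[n_L:, n_L:] = 1.0
    np.fill_diagonal(A, 0.0)
    A[j - 1, n_L + g - 1] = 1.0
    return normalize_rows(A)

n_L, n_G, j, g, c = 5, 5, 3, 1, 0.85
A = line_to_graph(n_L, n_G, j, g)
R = np.linalg.solve(np.eye(n_L + n_G) - c * A.T, np.ones(n_L + n_G))

i = np.arange(1, n_L + 1)
at_or_above = (1 - c ** (n_L + 1 - i)) / (1 - c)                       # i >= j
below = ((1 - c ** (j - i)) / (1 - c)
         + c ** (j - i) * (1 - c ** (n_L + 1 - j)) / (2 * (1 - c)))    # i < j
line = np.where(i >= j, at_or_above, below)

tail = c * (1 - c ** (n_L + 1 - j)) / (2 * (1 - c))
den = (n_G - 1) - c * (n_G - 2) - c**2
R_g = 1 / (1 - c) + tail * ((n_G - 1) - c * (n_G - 2)) / den
R_other = 1 / (1 - c) + tail * c / den
print(R[:n_L], line)
print(R[n_L], R_g, R_other)
```

In this example the node linking out keeps the rank it has in a plain simple line, while the nodes below it receive only half of the flow that previously passed through it.
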


3.3 Connecting the Simple Line with a Complete Graph
by Letting One Node in the Line Be Part 
of the Complete Graph 


If we instead let one node in the line be part of the complete graph we get another 
interesting example to look at. An example of what the system could look like can 
be seen in Fig. 3. 

We formulate the following theorem for the PageRank of the given example: 
Theorem 6 The PageRank of the nodes in system $S_L$ made up of a simple line and system $S_G$ made up of a complete graph, after one of the nodes $e_j \in S_L$ becomes part of the complete graph, assuming uniform $u$, can be written:

\[ R^{(2)}_{L,i}[S_L \leftrightarrow S_G] = \frac{1 - c^{n_L + 1 - i}}{1 - c}, \quad i > j, \tag{23} \]

\[ R^{(2)}_{L,j}[S_L \leftrightarrow S_G] = \left(\frac{1 - c^{n_L + 1 - j}}{1 - c} + \frac{c(n_G - 1)}{(n_G - 1) - c(n_G - 2)}\right)\left(\frac{n_G((n_G - 1) - c(n_G - 2))}{n_G((n_G - 1) - c(n_G - 2)) - c^2(n_G - 1)}\right), \tag{24} \]

\[ R^{(2)}_{L,i}[S_L \leftrightarrow S_G] = \frac{c^{j-i}\,R^{(2)}_{L,j}[S_L \leftrightarrow S_G]}{n_G} + \frac{1 - c^{j-i}}{1 - c}, \quad i < j, \tag{25} \]

\[ R^{(2)}_{G,i}[S_L \leftrightarrow S_G] = \frac{(c + n_G)(n_G - 1)(1 - c) + (n_G - 1)c^2(1 - c^{n_L - j})}{(1 - c)\left(n_G(n_G - 1) - (n_G - 1)c^2 - n_G(n_G - 2)c\right)}, \tag{26} \]

where $R^{(2)}_{G,i}[S_L \leftrightarrow S_G]$ is the PageRank for the nodes in the complete graph (except the node also being a part of the line) and $R^{(2)}_{L,i}[S_L \leftrightarrow S_G]$ is the PageRank of nodes in the line. $n_G$, $n_L$ are the number of nodes in the complete graph and simple line respectively after making one node in the line part of the complete graph.


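Proof For the nodes $e_i \in S_L$, $i > j$ we get the PageRank for a simple line. In order to find $R^{(2)}_{L,j}[S_L \leftrightarrow S_G]$ we first use Theorem 1 to write it as:

\[ R^{(2)}_{L,j}[S_L \leftrightarrow S_G] = \left(\sum_{e_i \in S,\ e_i \neq e_j} P(e_i \to e_j) + 1\right)\left(\sum_{k=0}^{\infty} \left(P(e_j \to e_j)\right)^k\right), \tag{27} \]

where $P(e_i \to e_j)$ is the probability of getting from node $e_i$ to node $e_j$.

\[ \sum_{e_i \in S,\ e_i \neq e_j} P(e_i \to e_j) + 1 = \sum_{k=0}^{n_L - j} c^k + (n_G - 1)\frac{c}{n_G - 1}\sum_{k=0}^{\infty}\left(\frac{c(n_G - 2)}{n_G - 1}\right)^k = \frac{1 - c^{n_L + 1 - j}}{1 - c} + \frac{c(n_G - 1)}{(n_G - 1) - c(n_G - 2)}, \tag{28} \]

\[ P(e_j \to e_j) = \frac{c(n_G - 1)}{n_G}\frac{c}{n_G - 1} + \frac{c(n_G - 1)}{n_G}\frac{c(n_G - 2)}{n_G - 1}\frac{c}{n_G - 1} + \frac{c(n_G - 1)}{n_G}\frac{c^2(n_G - 2)^2}{(n_G - 1)^2}\frac{c}{n_G - 1} + \dots \tag{29} \]
\[ = \frac{c^2}{n_G}\sum_{k=0}^{\infty}\left(\frac{c(n_G - 2)}{n_G - 1}\right)^k = \frac{c^2(n_G - 1)}{n_G((n_G - 1) - c(n_G - 2))}, \]

\[ \sum_{k=0}^{\infty}\left(P(e_j \to e_j)\right)^k = \frac{n_G((n_G - 1) - c(n_G - 2))}{n_G((n_G - 1) - c(n_G - 2)) - c^2(n_G - 1)}. \tag{30} \]

Multiplication of the results from Eqs. (28) and (30) gives

\[ R^{(2)}_{L,j}[S_L \leftrightarrow S_G] = \left(\frac{1 - c^{n_L + 1 - j}}{1 - c} + \frac{c(n_G - 1)}{(n_G - 1) - c(n_G - 2)}\right)\left(\frac{n_G((n_G - 1) - c(n_G - 2))}{n_G((n_G - 1) - c(n_G - 2)) - c^2(n_G - 1)}\right). \tag{31} \]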


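For the nodes below the one in the complete graph we can write the PageRank as:

\[ R^{(2)}_{L,i}[S_L \leftrightarrow S_G] = R^{(2)}_{L,j}[S_L \leftrightarrow S_G]\,P(e_j \to e_i) + \sum_{k=0}^{j-i-1} c^k, \tag{32} \]

\[ P(e_j \to e_i) = \frac{c^{j-i}}{n_G}. \tag{33} \]

This gives:

\[ R^{(2)}_{L,i}[S_L \leftrightarrow S_G] = \frac{c^{j-i}\,R^{(2)}_{L,j}[S_L \leftrightarrow S_G]}{n_G} + \frac{1 - c^{j-i}}{1 - c}, \quad i < j. \tag{34} \]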


Left to prove we have the formula for the nodes in the complete graph not directly connected to the line, $R^{(2)}_{G,i}[S_L \leftrightarrow S_G]$. We do not need to consider the part of the line following the complete graph, since we cannot get from this part of the graph back to the complete graph. We already have the PageRank for the nodes not linking out in the complete graph, in the case where we have no line of nodes linking to the complete graph, from Theorem 3.

Since all paths $P(e_n \to e_i)$ where $e_n \in S_L$ need to go through node $e_j$, we can write these as a product of the probability to get to node $e_j$ times the probability to get from there to node $e_i$, for which we want to calculate PageRank.
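\[ R^{(2)}_{G,i}[S_L \leftrightarrow S_G] = \left(\sum_{k=0}^{\infty}\left(P(e_i \to e_i)\right)^k\right)\left(\left(\sum_{e_n \in S_L,\ e_n \neq e_j} P(e_n \to e_j) + 1\right)P(e_j \to e_i) + 1 + \sum_{e_n \in S_G,\ e_n \neq e_j, e_i} P(e_n \to e_i)\right). \tag{35} \]

Looking at the part depending on $e_j$ we get

\[ P(e_j \to e_i)\left(\sum_{k=0}^{\infty}\left(P(e_i \to e_i)\right)^k\right) = \frac{c(n_G - 1)}{n_G(n_G - 1) - n_G(n_G - 2)c - (n_G - 1)c^2}. \tag{36} \]

This can be found either in the matrix proof of Theorem 3 in [7] or by calculating the corresponding hitting probabilities. Using this we can decompose the PageRank into two parts:

\[ R^{(2)}_{G,i}[S_L \leftrightarrow S_G] = \left(1 + \sum_{e_n \in S_G,\ e_n \neq e_j, e_i} P(e_n \to e_i)\right)\left(\sum_{k=0}^{\infty}\left(P(e_i \to e_i)\right)^k\right) + \left(\sum_{e_n \in S_L,\ e_n \neq e_j} P(e_n \to e_j) + 1\right)P(e_j \to e_i)\left(\sum_{k=0}^{\infty}\left(P(e_i \to e_i)\right)^k\right). \tag{37} \]

Decomposing into the PageRank for a complete graph with links out and the PageRank for a simple line for the part corresponding to the line we get

\[ R^{(2)}_{G,i}[S_L \leftrightarrow S_G] = \frac{(n_G - 1)(n_G + c)}{n_G(n_G - 1) - n_G(n_G - 2)c - (n_G - 1)c^2} + \left(\frac{1 - c^{n_L - j + 1}}{1 - c} - 1\right)\frac{c(n_G - 1)}{n_G(n_G - 1) - n_G(n_G - 2)c - (n_G - 1)c^2} \tag{38} \]
\[ = \frac{(c + n_G)(n_G - 1)(1 - c) + (n_G - 1)c^2(1 - c^{n_L - j})}{(1 - c)\left(n_G(n_G - 1) - (n_G - 1)c^2 - n_G(n_G - 2)c\right)}. \]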
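
The closed forms of Theorem 6 can be compared with a direct solve. The sketch below assumes NumPy, $j \ge 2$ and our own indexing, with the other $n_G - 1$ members of the complete graph stored after the line nodes; the helper names are hypothetical.

```python
import numpy as np

def normalize_rows(A):
    A = A.copy()
    out = A.sum(axis=1)
    A[out > 0] /= out[out > 0, None]
    return A

def line_through_graph(n_L, n_G, j):
    """Line nodes 1..n_L at indices 0..n_L-1. Line node j (j >= 2) is also a member of a
    complete graph whose other n_G - 1 members sit at indices n_L..n_L+n_G-2."""
    n = n_L + n_G - 1
    A = np.zeros((n, n))
    for k in range(1, n_L):
        A[k, k - 1] = 1.0
    members = [j - 1] + list(range(n_L, n))
    for a in members:
        for b in members:
            if a != b:
                A[a, b] = 1.0
    return normalize_rows(A)

n_L, n_G, j, c = 10, 10, 6, 0.85
A = line_through_graph(n_L, n_G, j)
n = A.shape[0]
R = np.linalg.solve(np.eye(n) - c * A.T, np.ones(n))

core = (n_G - 1) - c * (n_G - 2)
Q = n_G * (n_G - 1) - (n_G - 1) * c**2 - n_G * (n_G - 2) * c
R_Lj = ((1 - c ** (n_L + 1 - j)) / (1 - c) + c * (n_G - 1) / core) * n_G * core / Q   # Eq. (24)
i = np.arange(1, n_L + 1)
above = (1 - c ** (n_L + 1 - i)) / (1 - c)                                  # Eq. (23)
below = c ** (j - i) * R_Lj / n_G + (1 - c ** (j - i)) / (1 - c)            # Eq. (25)
line = np.where(i > j, above, np.where(i < j, below, R_Lj))
R_G = (((c + n_G) * (n_G - 1) * (1 - c) + (n_G - 1) * c**2 * (1 - c ** (n_L - j)))
       / ((1 - c) * Q))                                                      # Eq. (26)
print(R[:n_L], line)
print(R[n_L], R_G)
```
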


Theorem 7 The normalizing constant $N$ for the simple line with one node being part of a complete graph using uniform $u$ can be written as:


264 C. Engstrém and S. Silvestrov 


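\[ N = (n_G - 1)R^{(2)}_{G,i}[S_L \leftrightarrow S_G] + R^{(2)}_{L,j}[S_L \leftrightarrow S_G] + \frac{n_L - 1}{1 - c} - \frac{c\,(1 - c^{n_L - j})}{(1 - c)^2} - \frac{c\,(1 - c^{j-1})}{(1 - c)^2} + \frac{c\,(1 - c^{j-1})\,R^{(2)}_{L,j}[S_L \leftrightarrow S_G]}{n_G(1 - c)}, \tag{39} \]

where:

• $R^{(2)}_{G,i}[S_L \leftrightarrow S_G]$ is the PageRank of nodes in the complete graph,

• $R^{(2)}_{L,j}[S_L \leftrightarrow S_G]$ is for the node in both the line and complete graph, and

• $R^{(2)}_{L,i}[S_L \leftrightarrow S_G]$ is for the nodes in the line.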

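Proof The normalizing constant is equal to the sum of the non-normalized PageRank of all nodes.

We have $n_G$ nodes in the complete graph, $(n_G - 1)$ not directly connected to the line and one connected to the line. This gives:

\[ N = (n_G - 1)R^{(2)}_{G,i}[S_L \leftrightarrow S_G] + R^{(2)}_{L,j}[S_L \leftrightarrow S_G] + \sum_{i \neq j} R^{(2)}_{L,i}[S_L \leftrightarrow S_G], \tag{40} \]

where $R^{(2)}_{L,i}[S_L \leftrightarrow S_G]$ is the PageRank of individual nodes in the line except for node $j$ in the line, for which we have $R^{(2)}_{L,j}[S_L \leftrightarrow S_G]$. For those nodes we have the PageRank:

\[ R^{(2)}_{L,i}[S_L \leftrightarrow S_G] = \begin{cases} \dfrac{1 - c^{n_L + 1 - i}}{1 - c}, & i > j, \\[2ex] \dfrac{c^{j-i}\,R^{(2)}_{L,j}[S_L \leftrightarrow S_G]}{n_G} + \dfrac{1 - c^{j-i}}{1 - c}, & i < j. \end{cases} \tag{41} \]

The sum of all nodes for which $i > j$ can be written:

\[ \sum_{i=j+1}^{n_L} R^{(2)}_{L,i}[S_L \leftrightarrow S_G] = \frac{n_L - j}{1 - c} - \frac{c\,(1 - c^{n_L - j})}{(1 - c)^2}, \tag{42} \]

where we use that the second part, $\sum_{i=j+1}^{n_L} c^{n_L + 1 - i}/(1 - c)$, is a geometric sum. Calculating the sum for $i < j$ in the same way we get:

\[ \sum_{i=1}^{j-1} R^{(2)}_{L,i}[S_L \leftrightarrow S_G] = \frac{j - 1}{1 - c} - \frac{c\,(1 - c^{j-1})}{(1 - c)^2} + \frac{c\,(1 - c^{j-1})\,R^{(2)}_{L,j}[S_L \leftrightarrow S_G]}{n_G(1 - c)}. \tag{43} \]

Summation of all individual parts completes the proof.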
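
As a consistency check of Theorem 7, the sketch below evaluates Eq. (39) with the ranks taken from a direct solve and compares it with the plain sum over all nodes. NumPy, $j \ge 2$, the node indexing and the helper names are assumptions made for this example.

```python
import numpy as np

def line_through_graph(n_L, n_G, j):
    """Line nodes 1..n_L at indices 0..n_L-1; line node j (j >= 2) also belongs to a
    complete graph whose other n_G - 1 members sit at indices n_L..n_L+n_G-2."""
    n = n_L + n_G - 1
    A = np.zeros((n, n))
    for k in range(1, n_L):
        A[k, k - 1] = 1.0
    members = [j - 1] + list(range(n_L, n))
    for a in members:
        for b in members:
            if a != b:
                A[a, b] = 1.0
    out = A.sum(axis=1)
    A[out > 0] /= out[out > 0, None]
    return A

def normalizing_constant(R_G, R_Lj, n_L, n_G, j, c):
    """Eq. (39), with the two ranks passed in as numbers."""
    return ((n_G - 1) * R_G + R_Lj + (n_L - 1) / (1 - c)
            - c * (1 - c ** (n_L - j)) / (1 - c) ** 2
            - c * (1 - c ** (j - 1)) / (1 - c) ** 2
            + c * (1 - c ** (j - 1)) * R_Lj / (n_G * (1 - c)))

n_L, n_G, j, c = 10, 10, 6, 0.85
A = line_through_graph(n_L, n_G, j)
n = A.shape[0]
R = np.linalg.solve(np.eye(n) - c * A.T, np.ones(n))

N_formula = normalizing_constant(R[n_L], R[j - 1], n_L, n_G, j, c)
N_direct = R.sum()
R_normalized = R / N_direct
print(N_formula, N_direct)
```
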


Now that we have an explicit formula for this example we can look at what happens 
when we change various parameters like c or the size of either the line or complete 
graph. 
4 A Closer Look at the Formulas for PageRank
in Our Examples 


Now that we have formulas for the PageRank of a couple different graph structures 
we are going to take a short look at what happens when we change some parameters. 
We will also take a look at the partial derivative with respect to c. 


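4.1 Partial Derivatives with Respect to c


In the case of the simple line with formula as seen earlier we get the derivative with respect to $c$ as:

\[ \frac{\partial}{\partial c} R^{(2)}_i[S_L] = -\frac{c^{n_L - i + 1}(n_L - i + 1)}{c(1 - c)} + \frac{1 - c^{n_L - i + 1}}{(1 - c)^2}. \tag{44} \]

Rewriting it and checking whether it is positive we get:

\[ \frac{1 - c^{n_L - i + 1}}{(1 - c)^2} - \frac{(n_L - i + 1)c^{n_L - i}}{1 - c} > 0 \tag{45} \]

\[ \Leftrightarrow \sum_{k=0}^{n_L - i} c^k > (n_L - i + 1)c^{n_L - i} \Leftrightarrow \sum_{k=0}^{n_L - i - 1} c^k > (n_L - i)c^{n_L - i}. \tag{46} \]

Since we have $0 < c < 1$, $n_L > i$ we have that $c^k > c^{k+1}$, so each of the $n_L - i$ elements of the left sum is larger than $c^{n_L - i}$; this gives:

\[ \sum_{k=0}^{n_L - i - 1} c^k > \sum_{k=0}^{n_L - i - 1} c^{n_L - i} = (n_L - i)c^{n_L - i}. \tag{47} \]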

For our case with a line connected to a complete graph by letting one node in the line be part of the complete graph we get the following derivative with respect to $c$:

\[ \frac{\partial}{\partial c} R^{(2)}_{L,j}[S_L \leftrightarrow S_G] = \frac{\left((-1 + c)n_G^2 + (-1 + c)^2 n_G - c^2\right)\left((-1 + c)n_G - 2c + 1\right)\left(\frac{\partial}{\partial c}G(c)\right) n_G}{\left((-1 + c)n_G^2 + (-1 + c)^2 n_G - c^2\right)^2} \tag{48} \]
\[ - \frac{(n_G - 1)\left(\left(c(-2 + c)n_G + 2c - 2c^2\right)G(c) - (n_G - 1)(n_G + c^2)\right) n_G}{\left((-1 + c)n_G^2 + (-1 + c)^2 n_G - c^2\right)^2}, \]

\[ G(c) = \frac{1 - c^{n_L - j + 1}}{1 - c}, \]

\[ \frac{\partial}{\partial c}G(c) = -\frac{c^{n_L - j + 1}(n_L - j + 1)}{c(1 - c)} + \frac{1 - c^{n_L - j + 1}}{(1 - c)^2}. \]

The derivatives have about the same shape as the original function. As c gets large 
so does the derivative and as ng increases the slope gets steeper for large c. 

Looking at the other nodes in the complete graph we get the derivative with respect to $c$ as follows, where $Q = n_G(n_G - 1) - (n_G - 1)c^2 - n_G(n_G - 2)c$:

\[ \frac{\partial}{\partial c} R^{(2)}_{G,i}[S_L \leftrightarrow S_G] = \frac{(n_G - 1)(1 - c) - (c + n_G)(n_G - 1) + 2(n_G - 1)c\,(1 - c^{n_L - j})}{(1 - c)\,Q} - \frac{(n_G - 1)c^{n_L - j + 1}(n_L - j)}{(1 - c)\,Q} \tag{49} \]
\[ + \frac{(c + n_G)(n_G - 1)(1 - c) + (n_G - 1)c^2(1 - c^{n_L - j})}{(1 - c)^2\,Q} - \frac{\left((c + n_G)(n_G - 1)(1 - c) + (n_G - 1)c^2(1 - c^{n_L - j})\right)\left(-2(n_G - 1)c - (n_G - 2)n_G\right)}{(1 - c)\,Q^2}. \]


4.1.1 Changes in the Size of the Complete Graph for Our Last Example 


When we change the size of the complete graph we can see for example what size would be the most effective for increasing one's PageRank. In all these examples we will use $n_L = 10$, $j = 6$, $c = 0.85$ and $n_G$ will vary between 1 and 50. First we note that the part above the complete graph is unaffected by the change of $n_G$. It is obvious however that as $n_G$ increases the normalizing constant in the normalized PageRank will likely get larger, resulting in a lower PageRank as long as it is part of a small system.

For the nodes in the complete graph we get the result in Fig. 4. 

Here we see something curious: the nodes seem to be gaining rank in the beginning while starting to fall after a while, possibly converging towards a single value depending on whether the node links out of the complete graph or not. The reason we get a local maximum is the fact that for a moderately large $n_G$ we maximize the probability of $R^{(2)}_{L,j}$ getting back to itself while keeping the complete graph large enough to keep most probability for itself. Here we can see that it is not always a good idea for an individual node to join a complete graph. If the node in question already has larger PageRank than the other nodes in the complete graph it actually might lose PageRank from joining it.
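
The rise and fall described above can be reproduced from the closed forms (24) and (26). The sketch below assumes NumPy and sweeps $n_G$ from 2 to 50 (we skip $n_G = 1$, where the complete graph has no other members); the function names are hypothetical.

```python
import numpy as np

def rank_shared_node(n_L, n_G, j, c):
    """Eq. (24): the node that is part of both the line and the complete graph."""
    core = (n_G - 1) - c * (n_G - 2)
    G = (1 - c ** (n_L + 1 - j)) / (1 - c)
    return (G + c * (n_G - 1) / core) * n_G * core / (n_G * core - c**2 * (n_G - 1))

def rank_graph_node(n_L, n_G, j, c):
    """Eq. (26): the other nodes of the complete graph."""
    Q = n_G * (n_G - 1) - (n_G - 1) * c**2 - n_G * (n_G - 2) * c
    top = (c + n_G) * (n_G - 1) * (1 - c) + (n_G - 1) * c**2 * (1 - c ** (n_L - j))
    return top / ((1 - c) * Q)

n_L, j, c = 10, 6, 0.85
sizes = np.arange(2, 51).astype(float)
shared = rank_shared_node(n_L, sizes, j, c)
others = rank_graph_node(n_L, sizes, j, c)

best_shared = int(sizes[np.argmax(shared)])
best_others = int(sizes[np.argmax(others)])
print("n_G maximizing the shared node's rank :", best_shared)
print("n_G maximizing the other nodes' rank  :", best_others)
```

With these parameters both maxima fall strictly inside the range, at a moderate size of the complete graph.
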
Fig. 4 $R^{(2)}$ of the nodes in the complete graph not part of the line (left) and part of the line (right) as a function of $n_G$


For the node below the complete graph we get the result in Fig. 5.

Here we see the great loser as $n_G$ increases. Since the chance of escaping the complete graph depends on $R^{(2)}_{L,j}[S_L \leftrightarrow S_G]/n_G$, this node's PageRank falls as $n_G$ increases. From this we see a clear example of the effects of complete graphs on their surrounding nodes. A complete graph can be seen as a type of sink: all links to the complete graph will be used to maximum effect within the complete graph. And even worse, even if the complete graph has some nodes that point out of it, their influence will be very small; since the nodes in the complete graph have a large number of links, the chance of escaping is low.


Fig. 5 $R^{(2)}$ of the nodes in the line below the complete graph as a function of $n_G$


5 A Look at the Normalized PageRank for the Line 
Connected with a Complete Graph 


Looking at the normalized PageRank in our last example with a simple line with one 
node being part of a complete graph we want to see how the PageRank changes as c 
or the relation between the size of the line or complete graph changes. 


5.1 Dependence on c


Plotting the PageRank with $n_G = 10$, $n_L = 10$, $j = 6$ and $c \in [0.01, 0.99]$ we get the following results. For the nodes in the line above and part of the complete graph, $R^{(1)}_{L,i}[S_L \leftrightarrow S_G]$, $i = 7$, we get the result in Fig. 6.

Looking at the results for the nodes above the complete graph we see that the function seems to have a max at about $c = 0.55$, after which it decreases faster the closer to $c = 1$ it gets. Looking at the node part of the complete graph we see the great "winner" as $c$ increases. Do note the difference in the axis for the different images, since this at its lowest point is actually about the same as the highest for the node above the complete graph.
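
The dependence on $c$ can be explored with a simple grid search over the normalized PageRank. The sketch below assumes NumPy, our own node indexing and a grid step of 0.01; it is not the optimization procedure used for the table and only reports where the maximum falls on the grid.

```python
import numpy as np

def line_through_graph(n_L, n_G, j):
    """Line nodes 1..n_L at indices 0..n_L-1; line node j (j >= 2) also belongs to a
    complete graph whose other n_G - 1 members sit at indices n_L..n_L+n_G-2."""
    n = n_L + n_G - 1
    A = np.zeros((n, n))
    for k in range(1, n_L):
        A[k, k - 1] = 1.0
    members = [j - 1] + list(range(n_L, n))
    for a in members:
        for b in members:
            if a != b:
                A[a, b] = 1.0
    out = A.sum(axis=1)
    A[out > 0] /= out[out > 0, None]
    return A

def normalized_pagerank(A, c):
    n = A.shape[0]
    r2 = np.linalg.solve(np.eye(n) - c * A.T, np.ones(n))
    return r2 / r2.sum()

A = line_through_graph(10, 10, 6)
c_grid = np.linspace(0.01, 0.99, 99)
above = np.array([normalized_pagerank(A, c)[6] for c in c_grid])    # line node i = 7
shared = np.array([normalized_pagerank(A, c)[5] for c in c_grid])   # line node j = 6

c_best_above = c_grid[np.argmax(above)]
print("grid maximizer for the node above the complete graph:", c_best_above)
print("largest normalized rank of that node:", above.max())
print("normalized rank of the shared node at c = 0.01 and c = 0.99:", shared[0], shared[-1])
```
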

We find the $c$ which maximizes the function for some other parameters $n_G$, $n_L$, $j$, $i$ in Table 1. All the local max/min are calculated using the optimization tool in Maple 15. For the nodes above the complete graph the location of the maximum seems to be moving towards the left as $n_G$ increases and towards the right as $n_L$ increases. In the same manner it moves towards the left as $i$ gets closer to $n_L$. The value of the maximum is only included for completeness; it is natural that it decreases as either $n_G$ or $n_L$ increases, as we in those cases get a larger number of total nodes in the system. It is interesting to note that the max seems to be going towards the right as both $n_G$, $n_L$ increase as well.


Fig. 6 $R^{(1)}$ of the node above the complete graph (left) and of the node in the line being a part of the complete graph (right) as a function of $c$
Table 1 Maximum PageRank $R^{(1)}$ of node $i$ "above" the complete graph and node $j$ depending on $c$ for various changes in the graph where one node in a simple line is part of a complete graph


$n_G$   $n_L$   $j$   $i$   $\arg\max_c R^{(1)}_{L,i}$   $\max R^{(1)}_{L,i}$   $\arg\max_c R^{(1)}_{L,j}$   $\max R^{(1)}_{L,j}$


10 10 
10 10 
10 10 


10 0.000 0.053 1 0.091 
0.515 0.054 0.893 0.107 
9 0.300 0.053 - - 


DI WLOL ADI DI DIDI AA 


For the node in the line that is part of the complete graph the PageRank of this 
node is the largest when c is large, sometimes with a local maximum and sometimes 
not. It seems to be that as the number of nodes in the complete graph increases we 
are more likely to find a local maximum than not. 

For the node just below the complete graph as well as the nodes in the complete 
graph not part of the line we get the results in Fig. 7. 

PageRank of the node below the complete graph decreases as $c$ increases, but compared to the nodes above the complete graph not as fast for large $c$. This is because the PageRank of the nodes in the complete graph increases so fast for large $c$ that even the comparatively small influence it has on the nodes out of it is enough to at least stop the extremely rapid loss of rank seen for the nodes above the complete graph.

As with the node in both the line and complete graph, PageRank of the other nodes in the complete graph increases very fast for large $c$. We once again see a hint to why a too large $c$ could be problematic: it is for large $c$ we get the largest relative changes in PageRank between nodes. We have no min/max here, instead PageRank increases faster and faster the larger $c$ gets.


Fig. 7 $R^{(1)}$ of the node below the complete graph (left) and of a node in the complete graph (right) as a function of $c$

We note that these local maxima and minima are not always present. In these cases we have a PageRank that is decreasing as $c$ increases for the whole interval. If the one exists we can expect the other to exist as well (since we expect the rank to decrease at the end of the interval). It is hard to say anything conclusive about the location or existence of local maximum or minimum points, but we do note that they exist. There is also a large difference in how PageRank changes for different (especially large) $c$; we can therefore expect $c$ to have an effect not only on the final rank and the computational aspect, but also on the final ranking order of pages.


5.2 A Look at the Partial Derivatives with Respect to c


Since we have the formulas for the normalized PageRank it is also possible to find the partial derivatives. Since the partial derivatives result in very large expressions (multiple pages each) they are not included here. By setting $n_G = n_L = 10$, $j = 6$ we get the result after taking the partial derivative with respect to $c$ for $0.05 < c < 0.95$ for the node $e_{L,7}$ above the complete graph as well as the node in the line part of the complete graph in Fig. 8.

We see the derivative falling faster as c increases. Here as well we see the more 
dramatic changes in large c above about 0.8. Apart from seeing the maximum at 
around c = 0.3 in the original function we can also see that the derivative seems to 
briefly increase in the beginning, reaching a maximum at about c = 0.1. For the node 
part of both the line and the complete graph we get the result in Fig. 8. 

We can see a high derivative all the way until we get to very large $c$ where it finally starts going down. We can clearly see the maximum at about $c \approx 0.9$ in the original function. For the node on the line below the complete graph we get the result in Fig. 9.


Fig. 8 Partial derivative with respect to $c$ of normalized PageRank of the node in the line above the complete graph (left) and of the node in the line part of the complete graph (right)


Fig. 9 Partial derivative with respect to $c$ of normalized PageRank of the node in the line below the complete graph (left) and of a node in the complete graph not part of the line (right)


Although the PageRank itself is decreasing for all $c$ (the derivative is negative throughout), the derivative has a local maximum at about $c \approx 0.6$. It is worth noting that the axis can be a little misleading; the partial derivative is in fact not that close to 0 at the local maximum. As before the largest changes are at high $c$. For the nodes in the complete graph not part of the line we get the result found in Fig. 9.

As before the largest changes are found at large $c$. Compared to the node part of both the complete graph and the line, the derivative for the ones only in the complete graph continues to increase as $c$ increases; however, the PageRank itself is not actually ever higher for the ones not part of the line. We have seen that although it is possible to find symbolic expressions for the PageRank and derivative for some simple graphs, as the complexity of the graph increases it becomes very hard to do. Already for these simple examples the partial derivatives are rather large and complicated expressions. Finding more general symbolic expressions for when the derivative is zero should be possible although problematic given the constraints and size of the problem.


5.3. A Comparison of Normalized and Non-normalized 
PageRank 


Here we will take a short look at the difference between normalized ($R^{(1)}$) and
non-normalized ($R^{(2)}$) PageRank in order to get a better understanding of the
differences between them. We already know that $R^{(2)} \propto R^{(1)}$, so there will always be
the same relation between the PageRank of two nodes. Here we instead look at how the
absolute difference between nodes differs between the two types of PageRank.
Since the PageRank is normalized to one in $R^{(1)}$, the PageRank of individual nodes will
obviously decrease as the number of nodes increases, potentially causing problems
with number representation for extremely large graphs unless this is taken into account
in the implementation. This is not as large a problem for $R^{(2)}$,
since most nodes will have approximately the same PageRank regardless of the size of the
graph. However, the possibly huge relative difference between nodes still needs
to be taken into consideration. We note, however, that with the current way of calculating
$R^{(2)}$ by solving the equation system, systems large enough to potentially be a
problem in $R^{(1)}$ are simply too large for us to solve in a timely manner.

There is one other main difference between the normalized and non-normalized
PageRank, namely dangling nodes and how they affect the global
PageRank. In $R^{(2)}$ a dangling node means that some of the “probability” escapes the graph,
resulting in a lower total PageRank (but still proportional to $R^{(1)}$). In $R^{(1)}$, however,
dangling nodes can be seen as linking to all nodes, and in fact behave exactly as
if they did. We illustrate the difference with a rather extreme example: a graph
composed of only four dangling nodes, and a complete graph composed of four
nodes.

An image of the systems can be seen in Fig. 10 below. When computing $R^{(1)}$ of
both systems, assuming a uniform weight vector u, they are obviously equal, with
PageRank $R^{(1)} = [1/4, 1/4, 1/4, 1/4]$; it does not even matter which c we choose as
long as it is between zero and one for convergence. For the non-normalized
PageRank, however, we get a large difference between the two systems:
for the complete graph we get $R^{(2)} = [1/(1-c), 1/(1-c), 1/(1-c), 1/(1-c)]$,
as seen in [7], whereas for the graph made up of only dangling nodes we
get $R^{(2)} = [1, 1, 1, 1]$ regardless of c. We see that while they might
be proportional to each other, the non-normalized version behaves differently for
dangling nodes, making a distinction between dangling nodes and nodes that link to
all nodes (including themselves, which we normally do not allow). This distinction
might seem unnecessary, since nodes that link to all or most other nodes should be
extremely uncommon or not exist at all, but this might not be the case when working
with smaller link structures, where such a distinction might be useful. It is also
this distinction that makes it possible to compare PageRank between
different systems in $R^{(2)}$, while this is not generally possible in $R^{(1)}$.
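
The four-node example above can be checked numerically. The following is only a minimal sketch under stated assumptions: it assumes that the non-normalized PageRank satisfies R = c·AᵀR + n·u, with A the row-normalized link matrix (zero rows for dangling nodes) and uniform u, and it uses a plain fixed-point iteration for brevity rather than a direct solve of the equation system. The function names `nonnormalized_pagerank` and `normalize` are hypothetical.

```python
def nonnormalized_pagerank(adj, c, sweeps=500):
    """Iterate R <- c * A^T R + n*u for uniform u (so n*u = 1 per node).

    adj[i][j] = 1 if node i links to node j. Dangling nodes pass nothing on,
    i.e. their share of the "probability" leaves the graph (assumed model).
    The fixed number of sweeps is enough for c not too close to 1.
    """
    n = len(adj)
    out_degree = [sum(row) for row in adj]
    rank = [1.0] * n
    for _ in range(sweeps):
        updated = [1.0] * n
        for i in range(n):
            if out_degree[i] == 0:
                continue  # dangling node
            share = c * rank[i] / out_degree[i]
            for j in range(n):
                if adj[i][j]:
                    updated[j] += share
        rank = updated
    return rank


def normalize(rank):
    """Rescale a rank vector so that it sums to one."""
    total = sum(rank)
    return [x / total for x in rank]


c = 0.5
complete = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
dangling = [[0] * 4 for _ in range(4)]

r2_complete = nonnormalized_pagerank(complete, c)  # each entry close to 1/(1-c)
r2_dangling = nonnormalized_pagerank(dangling, c)  # each entry equal to 1
r1_complete = normalize(r2_complete)               # each entry close to 1/4
r1_dangling = normalize(r2_dangling)               # each entry equal to 1/4
```

After normalization the two systems are indistinguishable, while the non-normalized vectors differ by the factor 1/(1-c), which is the distinction discussed above.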


6 Conclusions 


We have seen that we can solve the resulting equation system instead of using the
definition directly or using an iterative method such as the Power method. While this
method is significantly slower, it has made it possible to get a better understanding
of the different roles of the link matrix A and of how the parameter c influences the
ranking. We have seen how PageRank changes when connecting two systems, a line
of nodes and a complete graph, in various ways. For these systems we also found
explicit expressions for the PageRank and showed two different ways to find these:
either by solving the equation system itself or by calculating


$$\left( \sum_{e_i \in S,\ e_i \neq e_g} P(e_i \to e_g) + 1 \right) \left( \sum_{k=0}^{\infty} \bigl(P(e_g \to e_g)\bigr)^k \right),$$


where $P(e_i \to e_g)$ is the sum of probability of all paths from node $e_i$ to node $e_g$ and
the weight vector u is uniform.

Given the expressions for PageRank, we looked at the results when changing
some parameters. While it is hard to say anything specific, two things seem to be
true overall. First, the most dramatic changes happen as c gets large: usually somewhere
around c > 0.8 some nodes get a dramatically larger PageRank compared to the others.
Second, complete graphs, while not gaining a larger rank as the graph gets
larger, become a lot more reliable (as in less affected by changes of individual
nodes) in keeping their large PageRank as the structure gets larger.

We saw that, when using a uniform V, it is possible to split a large system S into multiple
disjoint systems $S_1, S_2, \ldots, S_N$ and calculate $R^{(2)}$ for every subsystem
by itself; the results will not differ from $R^{(1)}$ apart from a normalizing constant that is
the same across all subsystems. This is a property we would, if possible, like to have
when using the power method as well, since it could potentially greatly reduce
the work needed, primarily when doing updates in the system.


Graph Centrality Based Prediction
of Cancer Genes


Holger Weishaupt, Patrik Johansson, Christopher Engström, Sven Nelander,
Sergei Silvestrov and Fredrik J. Swartling


Abstract Current cancer therapies including surgery, radiotherapy and 
chemotherapy are often plagued by high failure rates. Designing more targeted 
and personalized treatment strategies requires a detailed understanding of druggable 
tumor driver genes. As a consequence, the detection of cancer driver genes has 
evolved to a critical scientific field integrating both high-throughput experimental 
screens as well as computational and statistical strategies. Among such approaches, 
network based prediction tools have recently been accentuated and received major 
focus due to their potential to model various aspects of the role of cancer genes in 
a biological system. In this chapter, we focus on how graph centralities obtained 
from biological networks have been used to predict cancer genes. Specifically, we 
start by discussing the current problems in cancer therapy and the reasoning behind 
using network based cancer gene prediction, followed by an outline of biological 
networks, their generation and properties. Finally, we review major concepts, recent
results as well as future challenges regarding the use of graph centralities in cancer 
gene prediction. 


Keywords Biological networks · Graph centrality · Disease genes · Gene prioritization


1 Introduction 


Efforts towards understanding and treating cancer have received major research focus 
for many decades. However, despite tremendous progress made during this time, 
mortality rates among cancer patients still remain high [149], implicating cancer as 
one of the leading causes for human deaths [7]. 

The recent technological advancements that facilitate high-throughput simulta- 
neous measurements of thousands of biological entities have now given researchers 
access to an even more detailed insight into the mechanisms underlying cancer. 
While such data have enabled the identification of numerous genes and pathways 
mis-regulated in and potentially causing cancer, related analyses at this resolution 
have also demonstrated that cancer is much more diverse and complex than what 
was initially expected. 

Specifically, transcriptional and epigenetic studies have revealed that individual 
tumor types can manifest in a multitude of different molecular appearances, also 
referred to as subtypes or subgroups [61, 153, 159, 165, 174]. While it was often 
assumed that each patient could be assigned to one unique such subgroup, recent 
studies in the malignant brain tumor Glioblastoma have further demonstrated that 
different subgroups of tumors might coexist in different regions [154] or even inter- 
variably change from cell to cell [120] in the same patient. Additionally, the bulk 
tumor might constitute numerous different cancer cell clones [110, 154], each of 
which in turn could entail a hierarchy of cancer related progeny cells [109], as well 
as other cell types from the tumor micro-environment [10]. In summary, cancer has 
presented itself as a rather heterogeneous disease not only at an inter- but also at an 
intratumoral level (compare also reviews [109, 197]). 

While the presence of different tumor subtypes with varying clinical prognoses 
suggests a need for more personalized therapies on one hand [32, 36], the afore- 
mentioned intratumor heterogeneity on the other hand presents one of the greatest 
obstacles towards the development of any successful treatment option. Specifically, 
considering the high degree of cellular diversity in the tumor mass, it is not only 
difficult to determine the individual tumor driving cells, but also to predict tumor 
plasticity due to sub-clonal interactions and dynamics upon targeted treatment [92, 
105, 108]. A recent study in the malignant childhood brain tumor Medulloblastoma 
has for instance demonstrated that between diagnosis and relapse there is often a very 
low agreement between genomic events and thus likely also dominant clones [115]. 
Coupled to a persisting lack of understanding of what drives the abnormal growth 
of many such cancer cells, it still remains a challenge to design drug treatments that
can effectively eliminate specific, let alone all, cancer cells in a patient. Ultimately,
it is assumed that cancer treatments currently often face the problem that only part
of the tumor bulk will be removed, while the remaining cells, either inherently resistant or
acquiring resistance through selective pressure, will survive treatment and ultimately
constitute the tumor relapse [105, 108].

Towards overcoming these failures in current therapy, improved strategies will be 
required, which likely comprise combination drug treatments alone or in connection 
with other treatment options [39, 92, 115, 200]. Implementing such strategies in turn 
requires a better knowledge about the drug targetable cancer genes that enable cells 
to drive the tumor development, metastatic dissemination and to facilitate treatment 
resistances. 

Given the ease of access to high-throughput ‘omics’ data, cancer related genes
can nowadays be predicted in a variety of different ways, the preferred alternative of
which is often the direct determination of genomic abnormalities in cancer patients
using sequencing or microarray based platforms. In particular, a related systematic
approach that has rapidly grown in importance during the last decade is the genome
wide association study (GWAS), in which the frequencies of genetic variants, often in
the form of single-nucleotide polymorphisms (SNPs), are investigated in patient and
control cohorts to identify associations between phenotypes and candidate genes [66].
As a result of such efforts, more than 10,000 such associations have been registered
so far [124].

However, the direct detection of candidate genes as exemplified by targeted re- 
sequencing or GWAS is often dependent on large cohort size for the detection of 
genetic variants as well as the validation of significant associations [66]. For rarer 
diseases or traits such a strategy might hence not be possible without extensive pool- 
ing of international biobanks. In those cases however, one might fall back on patient 
derived cell lines or animal models when available, in which one can then exploit sev- 
eral screens for candidate gene identification. Specifically, forward genetic screens 
attempt to discover the genetic event causing a given phenotype, e.g. a transforma- 
tion causing tumor initiation, progression or treatment resistance, by introducing a 
multitude of random sequence variations [114], using for instance transposon based 
mutagenesis [168] or retroviral mutagenesis [169]. Reverse genetic screens on the 
other hand are designed to determine the phenotype caused by a given gene alteration 
through targeted modification of the gene’s function [64], for instance through RNA
interference [150] or use of the CRISPR-Cas system [138]. 

Nevertheless, despite recent advances in the detection of disease specific genetic 
variants, it has also become clear that oftentimes driver events might not be readily 
distinguishable. Rather, many diseases have been shown to be polygenic [131, 184], 
i.e. exhibiting a large number of variations with only modest effects on susceptibility 
with the disease phenotype being shaped by interactions or complementary effects 
of the respective loci. For cancer, the identification of dominant driver genes is further
complicated by genetic heterogeneity owing to genomic instability, which causes
increased rates of mutations and structural aberrations [26, 50]. Specifically, while
the determination of genomic landscapes in various tumors has produced a number
of recurrently mutated gene loci, they have also demonstrated the genomic diversity
within cancer types with the presence of many infrequent events among patients and 
multiple distinct events within patients [82, 176]. To identify the disease-driving 
mechanism among the mass of low penetrance variants, it has been suggested to 
study the over-representation of functional pathways [128, 181] among the affected 
genes. However, as disease development and subsequent phenotypes are likely being 
shaped also by other factors, such as signals from the cellular microenvironment or 
life style and environmental influences, driving mechanisms might need to be studied 
in an even greater context of entire biological systems and molecular networks [128, 
141, 143, 175]. 

Importantly, biological networks not only present a highly adaptive means of 
illustrating the various aspects of intrinsic relationships and interactions of biological 
entities captured in diverse experimental data sets, but they also provide a power- 
ful basis for mathematical and bioinformatic studies of the underlying topological 
structures (compare reviews by [2, 100, 175, 199, 204]). Specifically, during
the last decade a number of graph theory based methods were suggested for the
prediction of disease genes from various types of molecular networks, as reviewed 
in [13, 185, 189]. 

In this chapter we will review some techniques and aspects of network based gene 
selection with a focus on graph centrality related methods, how such methods have 
been employed in cancer and disease gene screens and point out some challenges 
and future perspectives pertaining to such an approach. For the remainder of this text, 
depending on the context and source, we will include results and conceptions from 
both cancer network analysis as well as other disease network analyses. However, 
for the purpose of cancer gene identification, these two terms can here be considered 
equivalent. 


2 Biological Networks 


During the last decades, networks have gradually evolved to one of the most important 
tools in systems biology. The ubiquitous use of networks in biological science relies 
on the fact that they can be designed to model a great variety of relationships under- 
lying biological processes, including for instance direct physical interactions, causal 
dependencies, functional relatedness, as well as the co-regulation or cooperation of 
molecular entities. As a consequence, networks represent one of the most promis- 
ing resources for studying and understanding the complex dynamics of biological 
systems. Specifically, it has been suggested that, while more and more individual 
genomic variations are being linked to specific diseases, the actual disease phenotypes
might be a result of large-scale perturbations of the entire biological system
[141, 175]. Hence, the characterization of the underlying biological networks will be 
a crucial step in unraveling the intricate connections between genetic events, system 
wide alterations and disease outcomes. 


2.1 Network Types


Specific relational data has been gathered from smaller scale studies throughout the 
world for many decades now. In addition, the recent dawn of modern high-throughput 
analysis techniques has further facilitated affordable and fast large-scale screens of 
related biological data. Together, these resources have allowed the establishment of 
networks in various biological fields and modeling a multitude of different biological 
relationships. The major types of biological networks spawned by these advances 
are listed below. 


Gene co-expression networks attempt to model how genes are regulated together in 
certain signaling pathways by displaying genes as nodes and linking them by an edge 
if they show a correlation of expression values over a set of conditions or tissues [28, 
158, 202]. 


Genetic interaction networks (GIN), which depict the dependency of genomic 
variations in causing certain phenotypes by modeling genes as nodes and linking 
them by an edge if a simultaneous alteration of both is required to obtain a given 
biological outcome, such as lethality [166, 167]. 


Insertion site interaction networks for forward genetic screens, in which nodes 
represent genomic loci and/or genes that are targeted by the integration of a trans- 
poson or virus into the host genome and edges depict some type of relationship 
between these sites. Specifically, insertion sites might be linked to each other and to 
proximal genes, if the genomic distance between these loci is below some threshold 
thus identifying regions of highly clustered integration sites, which are commonly 
referred to as common integration sites (CIS) [49]. Alternatively, nodes could also 
depict the gene targets of a CIS and edges might depict the co-occurrence or mutual 
exclusivity of these CISs in a given sample of cells [88, 103, 170]. 


Metabolic networks, which can model biochemical processes as relationships 
between enzymes, substrates and/or metabolic reactions in various ways based on 
the specific choice of node and edge representations [74, 76, 179]. 


Pan-disease networks, in which nodes represent genes and edges connect genes that 
have been implicated in the same disease [55]. 


Protein domain networks, with nodes representing protein domains and a link indi- 
cating the presence of the two connected domains in the same protein 
[190, 191]. 


Protein phosphorylation networks, which attempt to model the protein phosphorylation
cascades in a cell by modeling proteins as nodes with directed edges indicating
a phosphorylation of the target protein (substrate) catalyzed by the source protein
(kinase) [125].


Protein-protein interaction networks (PPIN), in which nodes represent
proteins and undirected edges indicate a direct physical binding of two proteins [52,
73, 93, 137, 196].


Protein-RNA binding networks, which depict the physical interaction of RNA- 
binding proteins with their target RNAs [8, 93]. 


RNA-RNA interaction networks aim to capture transcriptional regulatory interac- 
tions similar to the TRNs but instead focus on the posttranscriptional regulation of 
micro RNAs (miRNAs) or non-coding RNAs (ncRNA) and their interaction with 
each other or with other RNA components of the gene regulatory machinery [71, 
93]. 


Transcriptional/gene regulatory networks (TRN/GRN), with nodes representing 
genes and directed edges indicating a regulatory relationship, in which the protein 
product of the first gene, usually a transcription factor (TF), binds to a DNA regulatory
region of the second gene to affect a transcriptional activation or repression [9, 60, 
130, 139]. 


Functional Association/Linkage networks (FAN/FLN), in which nodes represent 
genes or gene products and edges indicate a potential functional similarity, which 
is determined based on association evidence usually gathered from an integrated 
collection of other types of biological data [67, 97, 98]. 
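

As an illustration of how one of the network types listed above can be derived from data, the following minimal sketch builds a gene co-expression network by linking genes whose expression profiles are correlated. The use of the Pearson correlation, the absolute-value cut-off of 0.8, the gene names and the expression values are all assumptions made for this example and are not taken from the cited studies.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpression_edges(expr, threshold=0.8):
    """Link two genes if |correlation| of their profiles reaches the threshold."""
    genes = sorted(expr)
    edges = []
    for idx, g in enumerate(genes):
        for h in genes[idx + 1:]:
            if abs(pearson(expr[g], expr[h])) >= threshold:
                edges.append((g, h))
    return edges


# Hypothetical expression values of three genes over five conditions.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],
}
edges = coexpression_edges(expr)  # only geneA and geneB are linked
```

In practice the choice of correlation measure and threshold strongly affects the resulting topology, so these would have to be chosen with the data set at hand in mind.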


2.2. Generation of Biological Networks 


The interactions represented in biological networks should preferably be derived 
from curated and validated experimental data in order to ensure biological soundness 
of the modeled relationships. For instance, there exists a plethora of methods for 
the targeted study of individual protein-protein interactions [111, 121]. However, 
while such traditional methods have produced valuable collections of high-quality 
interaction data during the last decades, their low-throughput nature often makes 
them time consuming and expensive to perform for thousands of biological entities. 

On the other hand, the onset of high-throughput technologies has opened up 
another route to predict biological interactions through large-scale experimental 
screens. For instance, transcription factor binding sites can be experimentally stud- 
ied through chromatin immunoprecipitation (ChIP) analyses [42, 173], while tran- 
scriptional regulatory interactions can be estimated by combining ChIP and gene 
expression data alone [126, 183] or in combination with other genomic data [182]. 
For the detection of PPIs a multitude of different high-throughput methods have 
been discussed [15]. In addition, with the increasing amount of high-throughput 
data made available, there have also been substantial advances of methodology 
for the computational prediction of biological interactions. Specifically, putative 
TF binding sites (TFBSs) can for instance be computationally predicted using 
nucleotide position weight matrices of known TF binding motifs [25], such as those made
available for instance through the TRANSFAC [107] and JASPAR [137] databases,
or by related models [106, 148]. For the prediction of Protein-Protein Interactions
(PPIs) a plethora of different approaches have been developed, as reviewed in [146,
171]. Finally, the last decade has also seen the development of numerous algorithms
for the reverse engineering of TRNs from expression data [4, 44, 65, 68, 69, 104,
151].


Fig. 1 Structural organization of two biological networks. The figure shows the topology of a
BioGrid network including only links of the ‘Phenotype Enhancement’ or ‘Phenotype Suppression’
type (a) and a BioGrid network including only links of the ‘PCA’ type (b)

Together, the contributions from low-throughput, high-throughput and compu- 
tational efforts have produced a multitude of resources comprising relationships 
between biological entities, which are presented as pathways, interaction or reac- 
tion data in databases such as the Kyoto Encyclopedia of Genes and Genomes 
(KEGG) [81], the Reactome pathway knowledge base [43], the Biological General 
Repository for Interaction Datasets (BioGRID) [156], the STRING database [73], 
Pathway Commons (PC) [29], the Human Protein Reference Database (HPRD) [123], 
the Database of Interacting Proteins (DIP) [136], or the IntAct Molecular Interaction
Database [85], the majority of which are also integrated in the multi-resource
ConsensusPathDB interaction database [80]. 


2.3 Properties of Biological Networks 


Biological networks generated according to the different approaches and depicting 
different types of interactions will typically look quite different (compare for instance 
Fig. 1), and do not necessarily exhibit the same edges between any given pair of 
genes/proteins. 
Nevertheless, the majority of biological networks are still considered to exhibit
certain ubiquitous topological properties, which have been reviewed frequently [2, 
12, 13, 100, 175, 199, 204]. 

Specifically, biological networks have been found to exhibit higher clustering
coefficients than random networks [179, 190, 196], which in turn implies a clustered
organization into regions with a higher connectivity internally than
to the rest of the network, compare also Fig. 1. A related modular organization
has been demonstrated for PPINs [132, 155], metabolic networks [129] and TRNs
[37, 60, 130].

As demonstrated by studies on TRNs [112, 144], PPINs [193], as well as com- 
posite PPINs and TRNs [195], the topology of biological networks is furthermore 
enriched for certain typical patterns of connectivity, referred to as network motifs. 

Additionally, metabolic networks [74, 179], PPINs [52, 75, 95, 157, 196], protein domain
networks (PDNs) [190], TRNs [60, 99, 139], gene co-expression networks [28] and GINs [167] have
also been found to be scale-free, i.e. they are not dominated by nodes with a specific
number of connections; instead, the distribution of connections per node follows
a power law, with a large number of nodes having very few connections and a small
number of nodes having a large number of connections [3], compare Fig. 2.
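
The degree distribution referred to above can be estimated directly from an adjacency matrix. The sketch below is only an illustration on a hypothetical six-node network: it computes the empirical probabilities P(k) of a node having k connections, which is the quantity one would subsequently compare against a power law. A network this small can of course not demonstrate scale-free behaviour by itself.

```python
from collections import Counter


def degree_distribution(adj):
    """Return {k: P(k)}, the fraction of nodes having exactly k neighbours."""
    degrees = [sum(row) for row in adj]
    counts = Counter(degrees)
    n = len(adj)
    return {k: counts[k] / n for k in sorted(counts)}


# Hypothetical undirected network: node 0 is a hub linked to nodes 1..5,
# and nodes 1 and 2 are additionally linked to each other.
n = 6
adj = [[0] * n for _ in range(n)]
for i, j in [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (1, 2)]:
    adj[i][j] = adj[j][i] = 1

p = degree_distribution(adj)  # many low-degree nodes, a single hub of degree 5
```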

Scale-free networks are also expected to exhibit the small world characteristic 
[5, 31], which has for instance been demonstrated for PPINs [52, 95, 196], 
PDNs [190] and metabolic networks [179]. The small world property implies that 
the majority of nodes in the network can be reached from any starting node in the 
network by traversing only a small number of links [186]. 

Given the nature and properties of these networks, certain topological network
analyses present themselves as powerful tools for extracting particular biological
features from the respective underlying data, compare for instance the chapters by
[2, 100, 199]. Specifically, due to the scale-free property, one can always find nodes
in those networks that appear more central with respect to certain parameters as
compared to other nodes in the network, compare Fig. 3.


Fig. 2 Scale-free property of two biological networks. The figure shows the estimated probabilities
P(k) of a node being connected with k other nodes for a BioGrid network including only links of
the ‘Phenotype Enhancement’ or ‘Phenotype Suppression’ type (a) and a BioGrid network including
only links of the ‘PCA’ type (b). Gray dots indicate measured probabilities and red lines represent
power law fits


Fig. 3 Illustration of central nodes in a network. The figure shows a scale free network, in which
the nodes with highest degree centrality (red), highest betweenness centrality (blue) and highest
closeness centrality (green) have been highlighted

The existence of high degree, i.e. hub, genes in biological systems has early
on been recognized as a potential avenue for the development of targeted drug treatments
[11], but the distinct topological properties of such networks have also been
suggested to lead to the discovery of novel disease genes and thus also therapy 
options. The next section will discuss how such properties might be utilized for 
cancer gene discovery. 


3 Candidate Gene Prediction Using Graph Centralities 


During the last decade numerous computational methods have been suggested for 
the network based prediction of disease and cancer genes. Among those techniques 
a large number assume a “guilt-by-association” [113, 185, 203] or “guilt-by- 
proximity” framework [189] and predict new candidate genes or pathways based 
on their direct functional linkage or network proximity (e.g. presence within the 
same network module) to known disease genes (compare for instance reviews and 
chapters [13, 185, 189]). As a consequence however, these methods rely on prior 
knowledge about existing disease genes in order to predict novel genes. 

In this chapter, we focus on centrality ranking as an alternative network based
approach, which has the potential benefit of not requiring prior knowledge about 
existing disease genes for a specific or similar disorder. Instead, one assumes that 
disease genes have very characteristic and determinable positions in their respec- 
tive network. Specifically, considering that highly central nodes, as a more integral 
component of the particular information flow in the network, also engage in more 
important roles in the underlying biology, it is of interest to be able to identify such 
nodes from biological networks. Considering such networks as mathematical graphs, 
a variety of related centrality equations have been defined and applied to extract nodes 
with specific connectivity characteristics. 

After introducing certain general notations to define mathematical networks and 
graphs, this section will review some of the more widely used centrality methods 
and discuss their applicability to biological networks. 


3.1 Some Prerequisites to Centralities 


In mathematics a network is described by a graph structure $G(V, E)$, where the set
$V = \{v_1, v_2, \ldots, v_n\}$ represents the individual nodes, also called vertices, and the
set $E = \{e_{ij}\}$, $i, j \in [1, n]$ denotes the edges that connect certain pairs of vertices.
Specifically, an edge $e_{ij}$ will imply that node $v_i$ has a link to node $v_j$. Such a graph
with n nodes can then always be described by an $n \times n$ matrix A, with element $(i, j)$
representing the value of edge $e_{ij}$ between nodes i and j, where the specific choice of
these edge values dictates the particular type of graph we are modeling. Specifically,
we will here distinguish between directed and undirected, weighted and unweighted
networks as well as between graph structures allowing or forbidding self loops.

In an undirected network, we will always have $e_{ij} = e_{ji}$ and a symmetric matrix
A, i.e. any edge in the network works in both directions in exactly the same fashion,
while in a directed network we might encounter pairs of nodes $v_i, v_j$ for which
$e_{ij} \neq e_{ji}$ and the matrix A is potentially non-symmetric. The respective undirected
graph is usually drawn without any arrow heads, while in the directed graph edges
will be replaced by arrows in order to indicate direction.

In an unweighted network, we always have $e_{ij} \in \{0, 1\}$, implying that the edge
between nodes $v_i$ and $v_j$ either exists or not, while in a weighted network the respective
matrix entries will correspond to so called weights $w_{ij}$ and can take any value
from a predefined interval, e.g. $w_{ij} \in [-1, 1]$. These weights are then meant to indicate
the strength of the connection described by the edge or, in the case of negative and
positive weights, can also distinguish between an inhibitory and a stimulating meaning
of edges. In the case of an unweighted network the matrix A will also be denoted as
adjacency matrix, while for a weighted network depending on convention one might
refer to A as weighted adjacency matrix.

Self loops designate edges $e_{ii}$ from node $v_i$ to itself. In some types of graphs
these self loops are allowed, while others explicitly prohibit them. In the adjacency
matrix A this decision will dictate whether the diagonal is always zero.
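
The distinctions made above (directed or undirected, weighted or unweighted, with or without self loops) can all be read off the matrix A. The following minimal sketch illustrates this on a hypothetical three-node example; the helper names are made up for this illustration, and nodes are indexed from 0 here, whereas the text counts from 1.

```python
def adjacency_matrix(n, edges, directed=False):
    """Build an n x n matrix A with A[i][j] = weight of edge e_ij (0 if absent)."""
    a = [[0.0] * n for _ in range(n)]
    for i, j, w in edges:
        a[i][j] = w
        if not directed:
            a[j][i] = w  # undirected: e_ij = e_ji
    return a


def is_symmetric(a):
    """True if A equals its transpose, as for every undirected network."""
    n = len(a)
    return all(a[i][j] == a[j][i] for i in range(n) for j in range(n))


def has_self_loops(a):
    """True if some diagonal entry is non-zero."""
    return any(a[i][i] != 0 for i in range(len(a)))


# Hypothetical weighted edges (source, target, weight); the negative weight
# could for instance encode an inhibitory relation, and (2, 2) is a self loop.
edges = [(0, 1, 1.0), (1, 2, -0.5), (2, 2, 1.0)]
undirected = adjacency_matrix(3, edges)
directed = adjacency_matrix(3, edges, directed=True)
```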
Finally, another network property that must be considered here is the so-called 
connectedness of the underlying network. Specifically, we say that a network is (strongly) 
connected if it is possible to reach any other node $j \neq i$ from node i by traversing 
the existing edges of the network. If this is true for the directed graph, it is strongly 
connected; if it is true only for the undirected graph, it is merely connected; 
and if neither is true, we say that the network is disconnected. This is of importance 
because, if the network is connected, we can define a number of additional network 
properties which do not make sense for a disconnected network. Specifically, we can 
mention here (1) the distance, also referred to as shortest path, between two nodes i 
and j, which is denoted by d(i, j) or dist(i, j) and is defined as the smallest number 
of edges that have to be traversed in order to travel from node i to j, and (2) the 
diameter diam(G) of the network, which is simply the largest value of d(i, j). The 
diameter and some distances are obviously not computable in a disconnected network, 
in which case they are often set to $d(i, j) = \infty$ or $\mathrm{diam}(G) = \infty$. 
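
The distance and diameter just defined can be illustrated with a breadth-first search. The following is a minimal sketch, assuming an unweighted network stored as a dictionary of adjacency lists; the function names and the two small example graphs are hypothetical and serve only as illustration.

```python
from collections import deque
import math

def bfs_distances(adj, source):
    """Hop distances d(source, j); unreachable nodes keep the value infinity."""
    dist = {v: math.inf for v in adj}
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if dist[w] == math.inf:  # first visit gives the shortest hop count
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def diameter(adj):
    """Largest distance over all ordered node pairs (infinity if disconnected)."""
    return max(max(bfs_distances(adj, v).values()) for v in adj)

# Hypothetical examples: a path 0-1-2-3 and a graph with an isolated node.
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
split_graph = {0: [1], 1: [0], 2: []}
```

For the disconnected example the diameter evaluates to infinity, which matches the convention mentioned above.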

Above it was mentioned that graphs can be connected or disconnected, weighted 
or unweighted, as well as directed or undirected. Different centrality measures 
make specific assumptions about the organization of the underlying network. 
Since one might want to analyze more than one type of network configuration 
with the same centrality measure, it is often necessary to extend a given 
centrality measure to network types for which it was not intended according to 
its original definition. In the following section we describe the original 
definition of each centrality measure and its requirements or assumptions 
regarding the underlying network structure, and we also gather potential 
approaches for modifying the given method for other types of network structures. 

The majority of centrality measures produce values of some absolute magnitude, 
which depend directly on, for instance, network size. Comparing these values 
between networks of different sizes is therefore not meaningful without a 
normalization step that transforms the absolute centralities into relative 
centralities. We have gathered proposed normalization schemes for each of the 
presented centrality measures in the following section. 


3.2. Definitions and Visualization of Common Centrality 
Measures 


As of today more than 110 different centralities have been described in the literature.¹ 
The majority of the centrality measures have been developed in other research fields 
for other purposes. Still, depending on the choice of interaction data, many of these 
centrality measures can directly or with slight modifications be applied to biological 
networks (see also Sects. 3.3 and 6.1 for discussions about applicability and mean- 


¹A comprehensive list of centralities can be found in the CentiServer (http://www.centiserver.org/) [72]. 
Table 1 Centrality measures based on shortest path* 


Centrality measure                 Equation                                                              Reference 
Closeness centrality               $\frac{1}{\sum_{j=1}^{n} d(v_i, v_j)}$                                [47, 135] 
Stress centrality                  $\sum_{k \neq i} \sum_{l \neq i} \sigma_{kl}(v_i)$                    [23, 145] 
Betweenness centrality             $\sum_{k \neq i} \sum_{l \neq i} \frac{\sigma_{kl}(v_i)}{\sigma_{kl}}$   [23, 46] 
Flow betweenness                                                                                         [48] 
Load centrality                                                                                          [23, 54] 
Eccentricity centrality            $\max\{d(v_i, v_j) : v_j \in V\}$                                     [62] 
Radiality/integration centrality   $\frac{\sum_{j \neq i} (D + 1 - d(v_i, v_j))}{n - 1}$                 [172] 


*Where $\sigma$ is defined as in Sect. 3.3.3, D is the diameter of the graph and $d(v_i, v_j)$ is the distance 
from vertex $v_i$ to vertex $v_j$ 


Table 2 Centrality measures based on powers of the adjacency matrix* 


Centrality measure       Equation                                                          Reference 
Degree centrality        $\sum_{j=1}^{n} a_{i,j}$                                          - 
Eigenvector centrality   $\frac{1}{\lambda} \sum_{j=1}^{n} a_{i,j} C_{eig}(v_j)$           [19] 
Katz status              $\sum_{k=1}^{\infty} \sum_{j=1}^{n} \alpha^k (A^k)_{j,i}$         [84] 
Page rank                $(I - \alpha M^{\top})^{-1} v$                                    [24] 
Cumulative nomination    $(I + A)^n e$                                                     [122] 


*Where A is the adjacency matrix and M is a scaled and slightly modified adjacency matrix. $\alpha$, $\beta$ are 
scalar parameters chosen appropriately and e, v are the one vector and a non-zero weight vector, 
respectively 


Table 3 Other centrality measures* 


Centrality measure        Equation                                                                              Reference 
Centroid value            $\min\{f(v_i, v_j) : v_j \in V \setminus v_i\}$                                       [152, 192] 
Clustering coefficient    $\frac{2\,|\{e_{j,k} : v_j, v_k \in N_i,\ e_{j,k} \in E\}|}{k_i (k_i - 1)}$           [186] 
Topological coefficient   $C_{TC}(v_i) = \frac{\mathrm{avg}(J(v_i, v_j))}{k_i}$                                 [157] 


*Where $f(v_i, v_j)$ is the difference between the number of vertices closer to $v_i$ than $v_j$ and the number 
of vertices closer to $v_j$ than $v_i$, $k_i$ is the number of neighbors of vertex $v_i$, and $J(v_i, v_j)$, defined only 
for vertices which share at least one neighbor, is the number of neighbors shared between $v_i$ and $v_j$, plus 
one if there is an edge between $v_i$ and $v_j$ 


ingfulness of centralities in biological networks). Here we list the definitions and 
show visualizations for some of the centrality measures most frequently applied to 
candidate gene prediction from biological networks. Specifically, we are separating 
the centralities here into those methods based on shortest path calculations (Table 1, 
Fig. 4), based on the calculations of powers of the adjacency matrix (Table 2, Fig. 5) 
and other centralities (Table 3, Fig. 6). 
Fig. 4 Illustration of centrality measures based on shortest path 
Fig. 5 Illustration of centrality measures based on powers of the adjacency matrix 


3.3 Applicability to Biological Networks 


As briefly noted above, the implementation and design of individual centralities make 
them more or less applicable to certain network types. In this section we will discuss 
some of the factors that impede general applicability of centralities and discuss some 
modifications and remedies to these problems. Of the centrality measures applied 
to the identification of biologically important nodes, degree, betweenness as well 
as closeness centrality are by far the most frequently utilized and studied methods. 
Fig. 6 Illustration of other centralities 


Thus, we will start with a more detailed illustration of such applicability 
considerations using these three centralities as examples, before concluding this section 
with a summary of similar reflections for the other centrality measures. 


3.3.1 Degree Centrality 


Degree centrality is one of the simplest and most straightforward measures of graph 
centrality and is based only on the number of edges connected to a specific node. 
Specifically, for an undirected network with no loops, the degree centrality $C_{deg}(v_i)$ 
of a node $v_i$ is equal to the number of edges connected to the node: 


$$C_{deg}(v_i) = \sum_{j=1}^{n} a_{i,j}.$$ 


If the network has loops, these are typically either ignored (when only the number 
of neighbors is of interest) or counted twice (once for each end of the loop touching the vertex). 
For directed, i.e. asymmetric, networks we define in-degree and out-degree to be 
Table 4 Modifications of degree centrality 


Directed network                                                                   Weighted network                                  Normalization 
$C_{deg_{in}}(v_i) = \sum_{j=1}^{n} a_{j,i}$                                       $C_{deg_w}(v_i) = \sum_{j=1}^{n} w_{i,j}$         $C_{deg_{norm}}(v_i) = \frac{C_{deg}(v_i)}{N - 1}$ 
$C_{deg_{out}}(v_i) = \sum_{j=1}^{n} a_{i,j}$                                                                                        $C_{deg_{norm}}(v_i) = \frac{C_{deg}(v_i)}{C_{deg_{max}}}$ 
$C_{deg_{total}}(v_i) = \sum_{j=1}^{n} a_{j,i} + \sum_{j=1}^{n} a_{i,j}$ 


the number of incoming or outgoing edges of the node, respectively, as well as the 
total-degree, which is the sum of in-degree and out-degree. 

It should be obvious that degree centrality can be calculated on disconnected and 
weighted networks as well, since it only counts the number of edges connected to 
a node. However, when working with a weighted network it makes sense to take 
the weights into consideration by instead calculating the sum of the weights of all 
connected edges. 

Several aspects of normalization have been outlined in [187]. Specifically, we 
normalize by scaling the centrality values by the maximal degree centrality 
obtainable in a network of the given size, which gives 
a scaling factor of 1/(N - 1) if no self-loops are allowed. For weighted networks it 
is common to normalize by dividing by the maximum observed non-normalized 
degree. 

A summary of the discussed modifications can be found in Table 4. 
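
The degree variants discussed in this subsection can be sketched in a few lines. The example below is an illustration only, assuming the network is given as a (possibly weighted) adjacency matrix in the form of nested lists, where `a[i][j]` is the entry for the edge from node i to node j, and assuming that self-loops are ignored; all function names are hypothetical.

```python
def degree_centralities(a):
    """Return (in-degree, out-degree, total-degree) lists for adjacency matrix a.

    a[i][j] is the (possibly weighted) entry for the edge from node i to node j.
    Diagonal entries (self-loops) are ignored in this sketch.
    """
    n = len(a)
    out_deg = [sum(a[i][j] for j in range(n) if j != i) for i in range(n)]
    in_deg = [sum(a[j][i] for j in range(n) if j != i) for i in range(n)]
    total = [x + y for x, y in zip(in_deg, out_deg)]
    return in_deg, out_deg, total

def normalize_by_size(deg):
    """Scale by 1/(N - 1), the maximal degree without self-loops."""
    n = len(deg)
    return [d / (n - 1) for d in deg]

def normalize_by_max(deg):
    """Scale by the largest observed value (common for weighted networks)."""
    m = max(deg)
    return [d / m if m else 0.0 for d in deg]

# Hypothetical directed example with edges 0->1, 0->2 and 1->2.
directed = [[0, 1, 1],
            [0, 0, 1],
            [0, 0, 0]]
in_deg, out_deg, total_deg = degree_centralities(directed)
```

For an undirected network the matrix is symmetric, so in-degree and out-degree coincide and either list can be used as the degree centrality.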


3.3.2 Closeness Centrality 


Closeness centrality [47, 135] is commonly defined as: 


$$C_c(v_i) = \frac{1}{\sum_{j=1}^{n} d(v_i, v_j)}.$$ 


The original definition of closeness centrality as given above only makes sense 
for connected undirected networks where the distance $d(v_i, v_j)$ is well defined. If 
the network is directed, then it is possible that $d(v_i, v_j) \neq d(v_j, v_i)$, and if it is not 
(strongly) connected, then the distance between some nodes will be undefined. 

For disconnected networks, a number of potential modifications to the original 
method have been proposed, some of which have been reviewed in [18]. Throughout 
this discussion we will assume that the distance between two vertices is infinite if 
there is no path between them. 

A simple solution is achieved by ignoring all unreachable nodes in the computation 
of closeness [18]. Another solution was proposed by Chavdar Dangalchev [34], by 
moving the sum out of the quotient and more heavily penalizing long distances by tak- 
ing powers of two. A third solution goes back to the work of Nan Lin (1976) [96], who 
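Table 5 Modifications of closeness centrality 


Disconnected network                                                                                         Ref.    Normalization                                                   Ref. 
$C_c(v_i) = \frac{1}{\sum_{d(v_i, v_j) < \infty} d(v_i, v_j)}$                                               [187]   $C_{nc}(v_i) = \frac{N - 1}{\sum_{j=1}^{n} d(v_i, v_j)}$        [14, 47] 
$C_D(v_i) = \sum_{j \neq i} \frac{1}{2^{d(v_i, v_j)}}$                                                       [34]    $C_{nc}(v_i) = \frac{C_c(v_i)}{C_c^{max}}$                      [187] 
$C_L(v_i) = \frac{|\{v_j : d(v_i, v_j) < \infty\}|^2}{\sum_{d(v_i, v_j) < \infty} d(v_i, v_j)}$              [96] 
$C_H(v_i) = \sum_{j \neq i} \frac{1}{d(v_i, v_j)}$                                                           [133] 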


redefined closeness centrality based on so-called “nonempty coreachable sets” [18], 
producing a measure also referred to as Lin’s index. Finally, probably the most 
commonly used alternative definition is the so-called “harmonic centrality index” or 
simply “harmonic centrality” [18, 133]. This latter measure is similar to the one 
introduced by Chavdar Dangalchev, but the ordinary distance between the nodes is 
used instead of a power of two. 

A common method to deal with directed graphs is to calculate either 
in-closeness (using $d(v_j, v_i)$) or out-closeness (using $d(v_i, v_j)$), similarly to how we 
calculate degree centrality for directed graphs [187]. 

Whether closeness can be applied to weighted networks depends on the 
choice of distance function. The most common distance is the shortest path, 
which can easily be adapted to weighted graphs by regarding edge weights as costs 
and finding the path with minimum cost. 

A scaled version of closeness centrality was proposed by Beauchamp in 1965 [14] 
and revisited by Freeman in 1979 [47] in his definition of closeness centrality. 
It multiplies the absolute closeness by N - 1, which yields the inverse of the average distance. 
Furthermore, [187] extends this normalization to weighted networks by dividing the 
non-normalized centrality by the maximum possible value. 

A summary of the modifications discussed here can be found in Table 5. 
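
To make the differences between the variants for disconnected networks concrete, the sketch below computes three of them for a single node: closeness over reachable nodes only, Dangalchev's variant and harmonic centrality. It assumes an unweighted network stored as adjacency lists and follows outgoing edges, i.e. it uses $d(v_i, v_j)$; setting all values to zero for an isolated node is an assumption made here, and the function names are hypothetical.

```python
from collections import deque

def hop_distances(adj, source):
    """BFS hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def closeness_variants(adj, v):
    """Closeness-type scores of node v that stay finite on disconnected graphs."""
    d = [x for node, x in hop_distances(adj, v).items() if node != v]
    if not d:  # isolated node: assumed convention, all scores set to zero
        return {"reachable_only": 0.0, "dangalchev": 0.0, "harmonic": 0.0}
    return {
        "reachable_only": 1.0 / sum(d),           # unreachable nodes ignored
        "dangalchev": sum(2.0 ** -x for x in d),  # sum of 1 / 2^d
        "harmonic": sum(1.0 / x for x in d),      # sum of 1 / d
    }

# Hypothetical example: a path 0-1-2 plus the isolated node 3.
example = {0: [1], 1: [0, 2], 2: [1], 3: []}
scores_0 = closeness_variants(example, 0)
scores_3 = closeness_variants(example, 3)
```

In the Dangalchev and harmonic variants an unreachable node contributes zero automatically, since both $1/2^{d}$ and $1/d$ vanish as the distance grows to infinity.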


3.3.3 Betweenness Centrality 


Betweenness centrality [23, 46] can be seen as a measure of how important a node is 
for the communication between other nodes in the network by estimating how often 
it is visited when finding shortest paths between other nodes. 

If we let $\sigma_{kl}$ denote the number of shortest paths between two nodes $v_k$ and $v_l$ 
and let $\sigma_{kl}(v_i)$ denote the number of shortest paths between two nodes $v_k$ and $v_l$ 
that traverse node $v_i$, then betweenness centrality for a connected, directed, 
unweighted network can be formally defined as 


$$C_{between}(v_i) = \sum_{k \neq i \neq l} \frac{\sigma_{kl}(v_i)}{\sigma_{kl}}.$$ 
Since we are calculating shortest paths, disconnected networks pose a problem. 
While we have not found a documented solution in the literature, a simple solution 
is to set $\frac{\sigma_{kl}(v_i)}{\sigma_{kl}} = 0$ if $d(v_k, v_l) = \infty$, similar to the treatment of unreachable nodes for closeness centrality. 

The above holds for both directed and undirected networks, although for 
undirected networks, where $\frac{\sigma_{kl}(v_i)}{\sigma_{kl}} = \frac{\sigma_{lk}(v_i)}{\sigma_{lk}}$, it makes sense either to divide the total by 2, since 
we would otherwise count each path twice, or to modify the algorithm to only calculate 
shortest paths for one symmetric half of the network [187]. 

As for closeness, weighted networks can be handled by 
viewing edge weights as the cost of traversing the respective edge and finding the path with 
the smallest cost. 

Again, we normalize the raw betweenness centrality values by division by the 
maximum possible centrality. The respective maximum values have been given 
in [187], which considered the current node to be the center of a star network 
according to [46, 47]. Specifically, for an undirected network the maximum possible 
centrality value becomes $C^{max}_{between} = \frac{n^2 - 3n + 2}{2}$, while for a directed network the 
normalization factor is $C^{max}_{between} = n^2 - 3n + 2$. 
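
The definition and normalization above can be sketched as follows. The example accumulates the ratios $\sigma_{kl}(v_i)/\sigma_{kl}$ with a breadth-first search from every node, in the style of Brandes' algorithm; this particular algorithm is a choice made for this sketch and is not prescribed by the text. It assumes an unweighted network given as adjacency lists with at least three nodes, and pairs without a connecting path simply contribute nothing.

```python
from collections import deque

def betweenness(adj, directed=False):
    """Sum of sigma_kl(v) / sigma_kl over all pairs, for every node v."""
    cb = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        preds = {v: [] for v in adj}
        sigma[s], dist[s] = 1, 0
        order = []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in adj[u]:
                if dist[w] < 0:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        # Back-propagate pair dependencies, farthest nodes first.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1.0 + delta[w])
            if w != s:
                cb[w] += delta[w]
    if not directed:
        # Each undirected pair was visited from both ends.
        cb = {v: c / 2.0 for v, c in cb.items()}
    return cb

def normalized_betweenness(adj, directed=False):
    """Divide by the star-network maximum (n^2 - 3n + 2), halved if undirected."""
    n = len(adj)
    cmax = (n * n - 3 * n + 2) / (1.0 if directed else 2.0)
    return {v: c / cmax for v, c in betweenness(adj, directed).items()}

# Hypothetical example: a star with center 0 and four leaves.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
star_scores = normalized_betweenness(star)
```

The center of the star lies on every shortest path between two leaves, so its normalized betweenness equals one, while every leaf scores zero.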


3.3.4 Summary of Applicability Considerations 


Table 6 summarizes, for all included centralities, whether they are applicable to 
directed, weighted and disconnected networks, respectively. A method that can be 
used on directed but not on disconnected networks requires the directed graph to be 
strongly connected unless noted otherwise. Some methods have problems with 
single unconnected vertices but otherwise work for disconnected networks; in this 
case the centrality is usually set to zero (or some other suitable value), hence we consider 
these applicable to disconnected networks as well. Similarly, most methods relying 
on shortest paths can easily be used on weighted graphs by using the shortest 
distance on the weighted graph instead. 


3.4 Linear Combinations of Centralities 


It has been mentioned that different centralities might operate differently or are not 
defined on certain network types, such as disconnected or directed networks. In addi- 
tion, when using a set of centralities, one should also consider possible redundancies 
between certain centralities. 

For example, on many weighted networks it is reasonable to assume there is only 
one shortest path between any pair of vertices, which would imply that betweenness, 
stress and load centrality will in this case all be the same. Another example is Katz 
status, which can be rewritten as $((I - \alpha A^{\top})^{-1} - I)e$ using a Neumann series, from which it 
is obvious that Katz status is equivalent to Bonacich Alpha/Beta centrality minus 1. 
Similarly, if $\alpha$ is small (close to 0), then Katz status and degree centrality give a 
Table 6 Applicability of centrality measures 


Centrality measure                 Disconnected        Directed   Weighted   Normalization* 
Closeness centrality               [18, 34, 96, 133]   [187]      [116]      [14, 47, 187] 
Stress centrality                  Yes                 Yes        [127]      Yes(o) 
Betweenness centrality             Yes                 [187]      [116]      [23, 187] 
Flow betweenness                   Yes                 Yes        Yes        Yes(t) 
Load centrality                    Yes                 Yes        [23]       [23] 
Eccentricity centrality            Yes                 Yes        Yes        Yes(t) 
Radiality/integration centrality   [172]               Yes        [116]      [172] 
Degree centrality                  Yes                 Yes        Yes        [187] 
Eigenvector centrality             No                  Yes        [20, 21]   [134, 187] 
Katz status                        Yes                 Yes        Yes        Yes(o) 
Page rank                          Yes                 Yes        Yes        Yes(p) 
Cumulative nomination              [122]               Yes        Yes        Yes(o) 
Centroid value                     No                  Yes**      Yes        No 
Clustering coefficient             Yes                 Yes        No         Yes(t) 
Topological coefficient            Yes                 No         No         Yes(t) 


*Where (t) denotes normalization using a theoretical maximum, (o) denotes normalization using the 
observed maximum of the corresponding non-normalized centrality and (p) denotes normalization 
into a probability density (non-negative ranks with sum one) 

**Requires a choice of direction for “distance” calculations and the network to be strongly connected 
instead of simply connected 


similar ranking since higher order terms in the sum disappear quickly when calcu- 
lating Katz Status. 
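
The Neumann-series step behind this rewriting, and the small-parameter limit, can be written out explicitly. The following is a sketch under stated assumptions: the matrix in the rewritten expression is taken to be the transposed adjacency matrix $A^{\top}$, $e$ is the vector of ones, and the scalar parameter $\alpha$ is assumed to be smaller than the reciprocal of the spectral radius of A so that the series converges.

```latex
\begin{align*}
C_{Katz} &= \sum_{k=1}^{\infty} \alpha^{k} (A^{\top})^{k} e
          = \Big( \sum_{k=0}^{\infty} (\alpha A^{\top})^{k} - I \Big) e
          = \big( (I - \alpha A^{\top})^{-1} - I \big) e, \\
C_{Katz} &= \alpha A^{\top} e + O(\alpha^{2}) \qquad \text{as } \alpha \to 0 .
\end{align*}
```

The $i$-th entry of the leading term $\alpha A^{\top} e$ is $\alpha \sum_{j} a_{j,i}$, i.e. a multiple of the (in-)degree of $v_i$, which is why the two rankings tend to be similar for small $\alpha$.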

Thus, depending on the network context, centralities might be redundant due 
to highly similar enrichment profiles, or they could be complementary. In order to 
quantify these ranking similarities between a set of centralities on a given network, 
one can for instance inspect a correlation matrix [89, 90] (compare Fig. 7), where 
one would preferably employ a rank correlation coefficient such as Kendall’s tau, 
which allows for ties. 
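
A rank correlation that handles ties can be sketched directly from pair counts. The example below implements the tau-b variant of Kendall's tau with a simple quadratic loop; choosing tau-b is an assumption made for this illustration, since the text only asks for a coefficient that allows for ties, and the function name is hypothetical.

```python
from math import sqrt

def kendall_tau_b(x, y):
    """Kendall's tau-b between two equally long lists of centrality values."""
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both rankings: pair is left out entirely
            elif dx == 0:
                ties_x += 1
            elif dy == 0:
                ties_y += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    denom = sqrt((concordant + discordant + ties_x)
                 * (concordant + discordant + ties_y))
    return (concordant - discordant) / denom if denom else float("nan")
```

Applying this function to every pair of centrality vectors computed on the same network yields a correlation matrix of the kind described above. The quadratic loop is adequate for a sketch but becomes slow for networks with many thousands of nodes.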

As a consequence of the observed differences in centrality-based enrichment 
patterns, it has been suggested that multiple centralities should be employed when 
ranking genes in biological networks [89]. Importantly, it can even be argued that 
linear combinations of certain centrality measures can be complementary and allow 
for the enrichment of novel, distinct sets of nodes. For instance, combining degree and 
betweenness centrality can lead to the enrichment of specific nodes that are central 
in terms of both connections and shortest paths within the network (compare Fig. 8). 
Thus, inspecting complementary enrichment patterns and investigating linear 
combinations of centrality measures might also be beneficial in the identification of 
cancer genes. 
Fig. 7 Rank correlation between centrality rankings. The figure shows a heatmap of the pairwise 
Kendall’s tau rank correlation coefficient between centralities calculated on the BioGrid network 
including only links of the ‘Synthetic Haploinsufficiency’. For the computation of centralities, only 
the largest symmetrized adjacency matrix of the largest network component was included 


Specifically, a number of studies have employed combined centralities for the pri- 
oritization of genes, sometimes even stating combinations as a requirement in order 
to see a centrality based enrichment of genes with a certain phenotype association. 
Siddani et al. [147] used a combination of ten centralities to identify novel 
candidate genes for Systemic Lupus Erythematosus. Bhattacharyya and 
Chakrabarti [16] prioritized proteins in PPI networks of Plasmodium falciparum and 
argued that integrating all employed centrality measures was necessary for identi- 
fying “truly central proteins”. del Rio et al. [35] investigated the centrality based 
prediction of essential genes from metabolic networks of Saccharomyces cerevisiae 
and found that at least two centrality measures had to be employed together in order 
to achieve a statistically significant identification of essential genes. 


3.5 Implementation 


Given the general applicability of centrality measures, a wide variety of packages 
and standalone software not specifically tailored for biological use is available 
for the R or MATLAB platforms, such as the R packages sna [27], igraph [33] 
Fig. 8 Illustration of central nodes in a network. The figure shows a scale-free network, in which 
the nodes with highest degree centrality (red), highest betweenness centrality (blue) and nodes with 
a high simultaneous score in both centralities (violet) have been highlighted 


and CePa [58] or the MatlabBGL package [53]. In addition, inspired by the growing 
importance of centrality-related questions in biological networks, numerous modules 
specifically intended for use on biological networks have been introduced in 
recent years, including for instance the CentiBin [79] and CentiLib [57] software 
tools as well as the CentiScape plugin [140] for the widely used biological network 
illustration tool Cytoscape [160]. A more comprehensive list of centrality software 
resources can be found on the CentiServer (http://www.centiserver.org/), a 
recently published tool for the calculation of a very large collection of network 
centralities through a web interface or R package [72]. 


4 Determining Enrichment of Cancer Genes Among High 
Centrality Nodes 


Despite a substantial body of investigations during recent years, the exact 
relationship between cancer genes and graph centralities remains largely unresolved. Hence, 
before utilizing centrality measures to nominate gene targets from regulatory 
networks, we must evaluate to what extent the selection of high centrality nodes will 
lead to an enrichment of cancer genes. The extent of such a relationship may well 
vary between different settings and should be considered carefully on a case-by-case 
basis. 

One important consideration is which genes we regard as cancer genes. For this 
purpose, one can make use of a number of resources and databases in order to classify 
genes into cancer related and cancer unrelated subsets. Examples of such databases 
are the Catalogue Of Somatic Mutations In Cancer (COSMIC) [45], the pathways 
in cancer set of genes from the KEGG database [81], the Candidate Cancer Gene 
Database [1], the Network of Cancer Genes (NCG) [6], or the IntOGen-mutations 
platform [56]. However, it should be noted that any list of genes can be used. The 
gene list used to evaluate the performance of a network inference method or cen- 
trality measure should be chosen so that nominating similar genes is of interest for 
downstream analysis. 

When investigating the relationship between centralities and cancer gene status, 
there are two main questions that can be addressed and represent different forms of 
enrichment. Specifically, one might investigate (1) if the most central genes are more 
often cancer genes or (2) if there is a tendency towards cancer genes having a higher 
centrality. 

To answer the first question, researchers often simply compare the mean or median 
centrality value between phenotype-related and phenotype-unrelated genes [63, 70, 
77, 117, 194] or select a number of top-scoring genes [117, 147] among which one 
can quantify the over-representation of phenotype-related genes. The comparison 
of means or medians can be performed using standard tests. The enrichment of 
cancer genes among the top central genes can be quantified using a hypergeometric 
or Fisher’s exact test. 
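
The hypergeometric test for the top-ranked genes can be sketched with exact binomial coefficients. The example assumes that the top list is treated as a draw without replacement from all genes in the network; the function and parameter names are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(n_total, n_cancer, n_top, n_overlap):
    """Upper-tail probability P(X >= n_overlap) of the hypergeometric distribution.

    n_total   : number of genes (nodes) in the network
    n_cancer  : number of genes annotated as cancer genes
    n_top     : number of top-ranked genes selected by centrality
    n_overlap : number of cancer genes observed among the top-ranked genes
    """
    upper = min(n_cancer, n_top)
    tail = sum(comb(n_cancer, k) * comb(n_total - n_cancer, n_top - k)
               for k in range(n_overlap, upper + 1))
    return tail / comb(n_total, n_top)
```

A small p-value indicates that the observed overlap is larger than expected if the top-ranked genes had been chosen at random. This one-sided tail coincides with the one-sided Fisher's exact test on the corresponding 2x2 table.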

The first question is thus straightforward to answer, and it might be informative 
for nominating gene targets, but it lacks nuance since it does not take into account the 
distribution of centrality values. The second question may therefore be more useful 
as a performance benchmark. 

The analysis of enrichment of centrality values bears a strong resemblance to gene 
set enrichment analysis often considered when interpreting gene expression in rela- 
tion to some measured phenotype. In this setting the question considered is whether 
a measured phenotype has a significant association with the expression of genes in 
a certain category (e.g. pathway membership or functional annotation). Many meth- 
ods exist for this purpose (e.g. [38, 86, 161, 164]). Such methods generally work by 
first quantifying the association of individual genes with the phenotype, in essence 
creating a ranking of the genes, and then quantifying the difference in distribution of 
these associations, comparing genes contained in a category and those not contained 
in that category. In our case the ranking of genes is the ranking of centralities, and 
the gene category in question is the set of genes considered to be cancer related. 

To measure the significance of centrality enrichment among cancer genes, we 
here propose a simple method based on the enrichment statistic used in the GSEA 
method [38]. First, we start by sorting the centralities in decreasing order, then iterate 
along this ranked list while keeping a running sum that is incremented when we 
encounter a gene that is cancer related, and decremented when we encounter a gene 
that is not. From the set of vertices $V = \{v_1, v_2, \ldots, v_j, \ldots, v_n\}$, ordered according 
to the magnitude of their centralities $C(v_j)$, and a set of nodes $S \subset V$, we obtain the 
size of each term in the sum as: 


$$\begin{cases} 1/|S|, & \text{if } v_j \in S, \\ 1/(|V| - |S|), & \text{if } v_j \notin S. \end{cases}$$ 


The test statistic (or enrichment score, ES) is defined as the largest absolute value 
of the running sum obtained throughout this iteration. An empirical p-value for the 
enrichment is obtained by comparing the observed test statistic to a null distribution 
obtained by repeated random permutation of the ranked list and calculation of the ES 
for each permutation. In Fig. 9 we illustrate one application of this method using the 
BioGrid network in Fig. 1a and the COSMIC cancer genes. However, this approach 
can be used with any set of genes, for instance from GO or KEGG, to determine whether 
a centrality ranking enriches for genes with a particular biological function. 
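
The running-sum statistic and its permutation test can be sketched as follows. The example assumes the genes are already sorted by decreasing centrality; the add-one correction in the empirical p-value, the use of "greater than or equal", and the fixed random seed are choices made for this sketch rather than part of the method as described, and the function names are hypothetical.

```python
import random

def enrichment_score(ranked_genes, gene_set):
    """Largest absolute value of the running sum along the ranked list."""
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(hits) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("need at least one gene inside and one outside the set")
    running, es = 0.0, 0.0
    for is_hit in hits:
        # Increment by 1/|S| for set members, decrement by 1/(|V| - |S|) otherwise.
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        es = max(es, abs(running))
    return es

def permutation_p_value(ranked_genes, gene_set, n_perm=1000, seed=0):
    """Observed ES and an empirical p-value from random permutations of the list."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, gene_set)
    shuffled = list(ranked_genes)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if enrichment_score(shuffled, gene_set) >= observed:
            at_least_as_extreme += 1
    # Add-one correction (an assumption here) avoids reporting p = 0.
    return observed, (at_least_as_extreme + 1) / (n_perm + 1)
```

Because the increments and decrements each sum to one in magnitude, the running sum always returns to zero at the end of the list, so the score reflects where in the ranking the set members are concentrated.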
Fig. 9 Test for enrichment of cancer status among high centrality genes. The figure shows the 
enrichment of COSMIC genes among high page rank scores in the BioGrid network including only 
links of the ‘Phenotype Enhancement’ or ‘Phenotype Suppression’ type. Top panel: distribution of 
degree values; middle panel: cancer status of genes; bottom panel: step function, where at each step 
the enrichment score (ES) increases if the gene is a cancer gene or decreases if the gene is unrelated. 
The p-value is estimated by calculating the percentage of permutations of cancer gene affiliations 
with ES scores greater than the one observed. For the computation of the centrality, only the largest 
symmetrized adjacency matrix of the largest network component was included 
5 Recent Results and Progress in Centrality Based 
Prioritization of Disease and Cancer Genes 


Since the first emergence of studies investigating centrality in biological 
networks, the further development and application of related methods has grown into an 
established field of biological research. Advancements in this field have been made 
in essentially two directions, which progress hand in hand. These directions 
comprise, on the one hand, the development of novel centrality measures and software 
facilitating the application of centralities to biological networks and, on the other hand, 
ever more intricate studies exploring the use of centralities for the ranking of 
biological entities with certain properties. 

Specifically, initial findings relating centralities to important genes or proteins 
were established using general centrality measures that had previously been defined 
in other scientific areas such as the social sciences. However, recent years have 
also seen the dawn of many novel centrality measures, inspired by or explicitly 
defined for ranking problems in biological networks [83, 91, 94, 162, 163]. In 
addition, a number of software tools or extensions more tailored for the investigation 
of centralities in biological networks have been developed, including for instance 
the CentiBin [79] and CentiLib [57] standalone implementations, the CentiScape 
plugin [140] for Cytoscape [160], or the web interface and R package provided by 
the CentiServer [72]. 

The interest in investigating the relationship between disease gene status and 
graph centralities was likely inspired by the initial observation in model systems 
suggesting that there might exist a correlation between the essentiality of a protein 
and its centrality in a PPI [40, 41, 63, 75, 78]. While the identification of essential 
proteins by the use of centrality measures has continued to draw scientific interest 
until today [35, 87, 94, 142, 163], these initial findings were subsequently also 
followed by a number of studies that more closely examined the association of 
centralities with disease or cancer gene status. 

Specifically, a study on lung squamous cell carcinoma tissue has reported that 
genes with upregulated expression in the cancer tissues showed a higher degree 
than genes with unaltered expression levels [178]. Similarly, Jonsson and Bates [77] 
showed that in human PPIs consensus cancer genes, i.e. genes with reported muta- 
tions in cancer, have a higher degree centrality than genes not found mutated in 
cancer. Another study investigated the centrality of OMIM derived disease-genes 
obtained in literature-curated PPIs and found the disease genes to exhibit a higher 
degree centrality than non-disease genes [194]. Using a small number of prostate can- 
cer genes from the OMIM database as seed genes in a literature-mined interaction 
network, Özgür et al. [119] found that centrality ranking could be used to enrich for 
genes with known prostate cancer association. A study of disease genes for primary 
immunodeficiency (PID) combined network centralities and GO ontologies to rank 
genes in a human immunome network and was able to identify a number of already 
known PID genes [117]. Similarly, Siddani et al. [147] predicted Systemic Lupus 
Erythematosus (SLE) genes from two different human immunome networks, also 
using a combination of centralities and GO ontologies, and found a large proportion 
of the predicted genes to represent known SLE genes. Starting with a large dataset 
of PPIs from the HPRD database, Izudheen and Sheena [70] performed centrality 
comparisons between cancer, cancer-chance and non-cancer genes in smaller sub- 
sampled networks and found that cancer and non-cancer genes differed in several 
centrality measures. 

One often cited study that debates the use of particularly degree centrality for the 
enrichment of disease genes is the work by Goh et al. [55]. The authors established a 
“disease gene network” by connecting any pair of disease genes obtained from the 
Online Mendelian Inheritance in Man (OMIM) database that were found to be associated 
with the same disease. While disease genes were found to account for high degree 
nodes in this network, this trend disappeared when excluding disease genes that are 
also embryonically or postnatally lethal. Particularly, Goh et al. suggest that essential 
genes in their pan-disease network are likely to form hubs, while the majority of 
disease genes, being non-essential, are located in the periphery with low degree 
centrality. However, the authors also report that disease genes caused by somatic 
mutations actually show a higher degree centrality and tendency to coincide with 
hubs. In addition, while this study has raised some concerns regarding the separation 
of essential genes and disease genes and the use of degree centrality to predict disease 
genes, one should bear in mind that the study investigated only one type of centrality 
in a pan-disease network rather than a direct molecular interaction network. Thus, 
the results may not exclude the possibility of associations between centralities and 
disease genes in other network types. 


6 Open Questions and Future Challenges 


6.1 Which Network to Choose 


When attempting to address a certain biological question, some network types might 
be more appropriate than others. However, in addition one should also consider how 
such data analyses might be influenced by the way in which the related networks 
have been generated. 

As mentioned above, the interactions of many biological networks can be derived 
in a variety of ways, the exact choice of which might bear some influence on the 
accuracy and completeness of the network. Specifically, networks solely established 
from low-throughput experimental data might exhibit low false-positive rates, but a 
large number of false negatives and additionally present with a bias towards interac- 
tions of molecules which are of greater scientific interest [51], such as for instance 
disease proteins [118]. High-throughput methods, as exemplified by protein-protein 
interaction assays, on the other hand might exhibit larger false-positive rates and 
could further be influenced by a variety of different biases [17, 51, 177]. 
Another important factor to consider is the generic nature of many interaction 
databases, predominantly PPI databases. Specifically, as such databases often represent 
aggregations of data from various sources, such as tissues, laboratories or methods, 
the contained interactions might be considered as a collection of possible 
interactions in an organism, but they often provide insufficient information about when and 
where a given interaction is present. Since such temporal and spatial patterns 
of interaction might differ substantially between different tissues and diseases, such 
databases might only be of limited use when attempting to prioritize disease genes 
for a specific disorder. In order to remedy this lack of tissue-specificity in generic 
databases, several integrative methods have been suggested in recent years 
[22, 59, 101, 180]. For other types of interaction data, such as transcriptional regula- 
tory relationships, transcription factor binding or genetic interactions, many tissue- 
and disease-specific datasets are publicly available and can be utilized to estimate 
the underlying networks. For instance, as mentioned above, a number of different 
methods exist for the inference of gene regulatory networks from expression data 
[4, 44, 65, 68, 69, 104, 151]. Individual techniques and especially community inte- 
grations of various techniques achieve ever increasing accuracies for the prediction 
of individual interactions [102]. However, it is still largely unexplored how well 
these methods can reconstruct the overall topology and thus also centralities in such 
estimated networks [188]. 


6.2 How to Determine Phenotype Specific Candidate Genes? 


Above it was discussed that depending on the choice of interaction resource, net- 
works utilized for cancer gene prioritization might lack tissue-specificity. However, 
even when prioritizing genes from a tissue- and disease-specific network, there still 
remains the question of whether the high centrality observed for a candidate gene is 
due to its association with the given phenotype. 

Specifically, it can be assumed that genes and proteins with central roles in the 
normal cell’s function also take central positions in respective biological networks, 
for instance master/global regulators in GRNs [90]. If those genes play crucial roles 
for cellular function and survival in the healthy tissue, it is reasonable to expect that 
a proportion of those genes, such as essential proteins and housekeeping genes [55], 
also has high centrality in the disease network without actually being linked to the 
disease phenotype. Hence a selection of network nodes with high centrality would 
naturally also include a number of genes which play a central role in the cell’s function, 
regardless of whether the cell belongs to a cancer patient or a healthy individual. The underlying 
topological overlap between networks of healthy and disease phenotype creates a 
marked problem for the prediction of candidate cancer genes. 

In order to overcome such a contamination by genes always central in a cell’s 
molecular system, one approach might be to scale or modify centralities observed in 
a cancer derived network based on the centralities of the equivalent genes in a network 
derived from the healthy counterpart. Alternative approaches could also make use 
of the fact that molecular networks are often enriched for small regulatory motifs 
[37, 112, 144, 195, 198] and that part of cancer development can be understood as a 
perturbation of the interactions in the healthy network [175, 185]. Thus, identification 
of cancer specific candidates could involve the identification of changes to central 
network structures and motifs [175, 185, 201] or the identification of cancer network 
enriched motifs [30]. 
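
The first idea above, adjusting the centralities of a cancer-derived network by those of the healthy counterpart, can be sketched as follows. This is only an illustration: the use of degree centrality, the plain difference as adjustment, the function names and the toy gene labels are all assumptions made here and are not taken from the cited studies.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Normalized degree centrality of an undirected network given as an edge list."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbors[u].add(v)
            neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


def differential_centrality(cancer_edges, healthy_edges):
    """Cancer centrality minus healthy centrality (a hypothetical adjustment)."""
    cancer = degree_centrality(cancer_edges)
    healthy = degree_centrality(healthy_edges)
    # Genes absent from the healthy network are treated as having centrality 0.
    return {gene: cancer[gene] - healthy.get(gene, 0.0) for gene in cancer}


# Toy networks with made-up labels: "HK" is central in both networks,
# whereas "C" gains interactions only in the cancer network.
healthy = [("HK", "A"), ("HK", "B"), ("HK", "C"), ("A", "B")]
cancer = [("HK", "A"), ("HK", "B"), ("HK", "C"), ("C", "A"), ("C", "B")]

scores = differential_centrality(cancer, healthy)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy example the always-central node "HK" receives an adjusted score of zero, while "C" moves to the top of the ranking. Other adjustments, such as ratios or rank differences, would fit the same pattern.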


6.3 Biological Context of Centralities 


As discussed in Sect. 3.3, given a certain network type, it is in most cases straightforward 
to make a selection of centrality measures that are mathematically applicable 
and meaningful. However, less is known about the biological meaning associated 
with individual centralities. Specifically, one has to wonder what principles of dis- 
tance, neighborhood or information flow as used by centralities would signify in a 
biological context and whether some biological property actually correlates 
with these centralities. Cases in which centrality ranking actually leads to the 
over-representation of cancer or disease genes might provide direct feedback about 
a potential functional or phenotypical association. However, this particular type of 
cancer gene prioritization would certainly gain in scientific soundness if centralities 
could be shown beforehand to have a biological meaning. 

There are some mentions of a further distinction of biologically useful centralities 
in the literature. For instance, from an exhaustive collection of centralities discussed 
and implemented in the CentiServer, the authors presented a subset of measures 
more appropriate for biological networks [72], although it is unclear whether this 
selection was made due to applicability considerations from a mathematical or biological 
perspective. On the other hand, the publication introducing the CentiScaPe 
plug-in for Cytoscape provides interpretations of the potential biological meaning 
represented by a number of centrality measures [140]. However, these efforts only 
cover a small number of the existing centralities. Considering furthermore the vast 
variety of biological networks and the complex interaction dynamics of the under- 
lying systems, it appears that we have just begun to link the concept of centralities 
with biological functions. Considering interpretations such as those provided by [140], it 
remains to be shown how one could quantitatively validate a novel interpretation, let 
alone identify such an interpretation for a yet uncharacterized centrality measure. 

One potential avenue for associating biological properties and centrality measures 
could be the exploration of functional annotations such as Gene Ontology (GO) terms 
in the context of centrality rankings. It has previously been shown that prioritiza- 
tion based on centrality and GO terms can be combined for the identification of 
essential proteins [87] or disease genes [117, 147]. Additionally, some studies have 
investigated the enrichment of functional annotations in centrality prioritized gene 
signatures. Specifically, Siddani et al. [147] performed GO enrichment analyses on 
top centrality scored genes in a human immunome network and found an enrichment 
of important immunology related functional annotations. Wang et al. [180] 
performed a GO and KEGG pathway enrichment analysis on different centrality 
based gene sets obtained from a context-constrained breast cancer network, which 
was obtained by projecting multiple breast cancer signature genes onto a PPI net- 
work. The authors investigated for each centrality the KEGG pathway and GO 
term with highest significance and interestingly found the “pathways in cancer” 
as the top KEGG category enriched in all centrality derived signatures. Ortutay and 
Vihinen [117] investigated GO enrichment among the 50 highest ranking genes 
extracted from a human immunome network. However, the authors only reported a 
few of the top ranking GO categories and noted the presence of the top scored term 
in all three centrality selected datasets. 

It would be interesting to expand on such investigations, to explore whether and 
which types of functional annotations could be associated with individual centrality 
signatures in various types of networks and tissues. 


7 Conclusion 


Network based ranking methods have emerged as important tools for the prioritization 
of targetable cancer driver genes. However, many such techniques rely on 
“guilt-by-association” approaches in order to predict genes or pathways related to 
known disease genes, which comes with limitations and bias due to the requirement 
of prior knowledge. Here we review an alternative approach that operates 
without the requirement of prior knowledge through the use of network centrali- 
ties. While such topological ranking methods are commonly used, the relationship 
between centralities in various network types and cancer gene status is still poorly 
understood. The centrality measure used, and to which network it is applied, impacts 
how we should interpret the results, and care must be taken when validating each 
approach. For these purposes it is essential to understand what the network repre- 
sents, and how different measures of centrality reflect various biological contexts. As 
always, even though much has been written on the topic, much work remains before 
we properly understand how network centrality can be used to prioritize targetable 
cancer driver genes. Two important pieces of this puzzle are the reference gene set 
used and what measure is employed to benchmark different methods, making them 
important topics for further study. 


Output Rate Variation Problem: Some 
Heuristic Paradigms and Dynamic 
Programming 


Gyan Bahadur Thapa and Sergei Silvestrov 


Abstract The output rate variation problem stands as one of the important research 
directions in the area of multi-level just-in-time production systems. In this short 
survey, we present the mathematical models of the problem followed by consideration 
of its NP-hardness. We further give a brief review of the heuristic approaches that 
have been devised to solve the problem. The dynamic programming approach and the 
pegging assumption are also briefly discussed. The pegging assumption reduces the 
multi-level problem to a weighted single-level problem. A couple of open problems 
regarding ORVP are listed at the end. 


Keywords Just-in-time · Objectives · Constraints · Heuristics · Dynamic 
programming 


1 Introduction 


The output rate variation problem is the multi-level production sequencing problem 
in a just-in-time (JIT) work environment. Toyota in Japan invented just-in-time 
production systems (JITPS) and benefited from them mostly during the 1960s and 1970s. 
The problems in JITPS are categorized into two types, namely the single-level problem, 
called the production rate variation problem (PRVP), and the multi-level problem, called 
the output rate variation problem (ORVP). The PRVP has been richly studied, for example in 
[3, 13, 28]. The PRVP deals only with the final assembly line and has polynomial 
time solutions, whereas the multi-level problem deals with the overall system from raw 
materials to final customers. The ORVP consists of several levels in the production 
supply chain, for example, raw materials → components → sub-assemblies → products 
→ distribution centers → retailers → customers. In this supply chain system, 
multiple copies of different models are produced at the final assembly level, 
which is interlinked with several upstream production levels where raw materials are 
procured, stored and fabricated to produce the final products [2] and with several 
downstream distribution levels where final products are stored and distributed to the 
retailers and then to the customers. 

The whole body of the supply chain consists of inbound logistics along the production 
levels and outbound logistics along the distribution levels. A synchronized view 
of seven levels of the production and supply chain network has been presented in [27]. 
There may be several sub-levels in between any two production levels. Therefore, 
the formulation of the ORVP contains $L$ levels, indexed by $l = 1, 2, \ldots, L$. 

The rest of the paper is organized as follows: Sect. 2 presents the mathematical 
formulations of ORVP followed by its NP-hardness. Section 3 describes the goal 
chasing heuristics developed by Toyota. The pegging assumption, which converts the 
ORVP into a weighted PRVP, is presented in Sect. 4, whereas the dynamic 
programming solution is reported in Sect. 5. Finally, Sect. 6 concludes the paper, 
pointing out some open problems. 


2 Mathematical Formulation of ORVP 


Assume that the mixed-model multi-level JITSP (i.e., ORVP) consists of $L$ levels 
of manufacturing operations, indexed by $l$, $l = 1, 2, \ldots, L$, with the product 
level being level 1. The number of different part types at level $l$ and the demand for 
item $i$ at level $l$ are denoted by $n_l$ and $d_{il}$, respectively, where $i = 1, 2, \ldots, n_l$. 
The number of total units of item $i$ at level $l$ required to produce one unit of product $p$ 
is denoted by $t_{ilp}$, such that 

$$ d_{il} = \sum_{p=1}^{n_1} t_{ilp} \, d_{p1} $$

is the dependent demand for item $i$ at level $l$ determined by the final product demands 
$d_{p1}$, $p = 1, 2, \ldots, n_1$, and $l = 1, 2, \ldots, L$. Note that $t_{i1p} = 1$ if $i = p$ and 0 
otherwise. Finally, $D_l = \sum_{i=1}^{n_l} d_{il}$ denotes the total demand at level $l$, and 
the ratio $r_{il} = d_{il} / D_l$ gives the demand rate for item $i$ of level $l$, such that 
$\sum_{i=1}^{n_l} r_{il} = 1$ at each level $l = 1, 2, \ldots, L$. 

It is noteworthy that the model of ORVP is assumed to be non-preemptive; that is, once 
production of a product at level 1 has commenced, it must be completed before switching 
to another unit. This creates the concept of various stages or cycles in the production 
system. The production schedule at level 1 consists of $D_1$ stages in total, and at each 
stage a single unit of an end-product can be processed. The schedule is said to be in 
stage $k$ ($k = 1, 2, \ldots, D_1$) if $k$ units of product have been produced at level 1, so 
that there are $k$ complete units of various products $p$ at level 1 during the first $k$ 
time units. 

Let $x_{ilk}$ be the necessary quantity of item $i$ produced at level $l$ during the time units 
1 through $k$, and let $y_{lk} = \sum_{i=1}^{n_l} x_{ilk}$ be the total cumulative quantity produced 
at level $l$ during the same time units, such that $y_{1k} = \sum_{i=1}^{n_1} x_{i1k} = k$. Due to 
the pull nature of the JITPS, the particular combination of the highest level products 
produced during the $k$ time units (the $x_{p1k}$ values) determines the necessary cumulative 
production at every other level. Thus, the required cumulative production for item $i$ at 
level $l$ with $l \ge 2$ through $k$ time units is given by 

$$ x_{ilk} = \sum_{p=1}^{n_1} t_{ilp} \, x_{p1k}. $$

For a unimodal convex penalty function $F_i$, $i = 1, \ldots, n_l$, with minimum 0 at 0, the 
maximum deviation and the sum deviation multi-level JIT sequencing problems in 
mixed-model systems (i.e., ORVP) are mathematically formulated to minimize the 
objectives $Z_{\max}$ and $Z_{\mathrm{sum}}$ as follows [14, 18]: 


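$$ Z_{\max} = \min \max_{i,l,k} F_i (x_{ilk} - y_{lk} r_{il}), \tag{1} $$

$$ Z_{\mathrm{sum}} = \min \sum_{k=1}^{D_1} \sum_{l=1}^{L} \sum_{i=1}^{n_l} F_i (x_{ilk} - y_{lk} r_{il}), \tag{2} $$

subject to 

$$ x_{ilk} = \sum_{p=1}^{n_1} t_{ilp} x_{p1k}, \quad i = 1, \ldots, n_l, \; l = 1, \ldots, L, \; k = 1, \ldots, D_1, \tag{3} $$

$$ y_{lk} = \sum_{i=1}^{n_l} x_{ilk}, \quad l = 1, 2, \ldots, L, \; k = 1, \ldots, D_1, \tag{4} $$

$$ y_{1k} = \sum_{p=1}^{n_1} x_{p1k} = k, \quad k = 1, 2, \ldots, D_1, \tag{5} $$

$$ x_{p1k} \ge x_{p1(k-1)}, \quad p = 1, 2, \ldots, n_1, \; k = 1, 2, \ldots, D_1, \tag{6} $$

$$ x_{p1D_1} = d_{p1}, \quad x_{p10} = 0, \quad p = 1, 2, \ldots, n_1, \tag{7} $$

$$ x_{ilk} \ge 0 \text{ integer}, \quad i = 1, \ldots, n_l, \; l = 1, \ldots, L, \; k = 1, \ldots, D_1. \tag{8} $$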


Here the constraint (3) ensures that the necessary cumulative production of part $i$ of 
level $l$ by the end of time unit $k$ is determined explicitly by the quantity of products 
produced at level 1. Constraints (4) and (5) show the total cumulative production 
of level $l$ and level 1, respectively, during the time slots 1 through $k$. Constraint (6) 
ensures that the total production of every product over $k$ time units is a non-decreasing 
function of $k$. Constraint (7) guarantees that the demands for each product are met 
exactly, and (8) is the integrality constraint. The constraints (5), (6), (8) jointly ensure 
that exactly one unit of a product is scheduled during one time unit in the product 
level. 
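
The dependent demands, total demands and demand rates defined above can be computed directly from the bill of materials. The following sketch assumes that the coefficients are stored as nested lists with `t[l][i][p]` holding the units of item i at level l+1 needed per unit of product p; this layout, the function name and the two-level example data are illustrative choices and do not come from the cited papers.

```python
from fractions import Fraction


def demand_rates(t, d1):
    """Return (d, D, r): dependent demands, total demands and demand rates per level.

    t[l][i][p]: units of item i at level l+1 needed for one unit of product p
                (t[0] is the identity matrix of the product level).
    d1[p]:      demand for product p at level 1.
    """
    d, D, r = [], [], []
    for level in t:
        # d_il = sum_p t_ilp * d_p1
        d_l = [sum(t_ip * d_p for t_ip, d_p in zip(row, d1)) for row in level]
        D_l = sum(d_l)                                   # D_l = sum_i d_il
        d.append(d_l)
        D.append(D_l)
        r.append([Fraction(d_il, D_l) for d_il in d_l])  # r_il = d_il / D_l
    return d, D, r


# Two products with demands (2, 1) and two part types at level 2.
t = [
    [[1, 0], [0, 1]],  # level 1: t_i1p = 1 if i == p, else 0
    [[1, 0], [1, 2]],  # level 2: hypothetical part requirements
]
d, D, r = demand_rates(t, [2, 1])
```

Exact fractions are used so that the rates at each level sum to exactly 1, which avoids rounding effects when deviations are compared later.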

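Particular cases of the objectives (1) and (2), with absolute and squared deviations, 
are studied in the literature [2, 27] as follows: 

$$ Z^{a}_{\max} = \min \max_{i,l,k} |x_{ilk} - y_{lk} r_{il}|, \tag{9} $$

$$ Z^{s}_{\max} = \min \max_{i,l,k} (x_{ilk} - y_{lk} r_{il})^2, \tag{10} $$

$$ Z^{a}_{\mathrm{sum}} = \min \sum_{k=1}^{D_1} \sum_{l=1}^{L} \sum_{i=1}^{n_l} |x_{ilk} - y_{lk} r_{il}|, \tag{11} $$

$$ Z^{s}_{\mathrm{sum}} = \min \sum_{k=1}^{D_1} \sum_{l=1}^{L} \sum_{i=1}^{n_l} (x_{ilk} - y_{lk} r_{il})^2. \tag{12} $$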
The ORVP is a nonlinear integer programming problem, whose objective functions 
describe the sequence dependent nature of the schedule for lower-level parts. The 
required cumulative productions $x_{ilk}$, $l > 1$, are calculated directly from the assembly 
sequence of the products $x_{i1k}$, and the desired production goal for model $i$ in 
level $l$ is calculated as the ideal proportion ($r_{il}$) of the total cumulative production 
quantity ($y_{lk}$) of level $l$. Balanced schedules are generated by keeping the required 
production of all parts and products as close to this goal as possible. 

The min-max objectives of ORVP aim to find a smooth schedule in every time 
period for every output. This is the basic concept underlying Toyota’s sequencing 
algorithm [20]. Moreover, the value of the min-max absolute deviation objective 
$Z^{a}_{\max}$ has a direct physical interpretation: it gives the maximum overproduction or 
underproduction (the maximum inventory or shortage) relative to the desired quantity of 
production that occurs at any time in the schedule. This fact may be used to determine the 
number of kanbans (or the necessary safety stocks) used [16]. The min-sum objectives 
of ORVP seek optimal schedules that may have a relatively large deviation in a 
single period or for a certain output while having the lowest possible total deviation. 
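
To make the difference between the min-max and min-sum objectives concrete, the deviations $x_{ilk} - y_{lk} r_{il}$ of a given level-1 sequence can be evaluated numerically. The sketch below only evaluates the four deviation measures for a fixed sequence; it does not optimize. The data layout (`t[l][i][p]` as nested lists), the function names and the small instance are assumptions made for illustration.

```python
from fractions import Fraction


def deviations(t, d1, sequence):
    """Yield x_ilk - y_lk * r_il for all items i, levels l and stages k.

    t[l][i][p]: units of item i at level l+1 per unit of product p (t[0] = identity).
    d1[p]:      demand for product p; sequence lists the product made at each stage.
    """
    rates = []
    for level in t:
        d_l = [sum(a * b for a, b in zip(row, d1)) for row in level]
        rates.append([Fraction(x, sum(d_l)) for x in d_l])
    produced = [0] * len(d1)                     # x_p1k, cumulative level-1 output
    for p in sequence:
        produced[p] += 1
        for level, r_l in zip(t, rates):
            x_l = [sum(a * b for a, b in zip(row, produced)) for row in level]
            y_l = sum(x_l)                       # total cumulative output of the level
            for x_il, r_il in zip(x_l, r_l):
                yield x_il - y_l * r_il


def objectives(t, d1, sequence):
    """Values of the absolute and squared max/sum deviation measures for one sequence."""
    dev = list(deviations(t, d1, sequence))
    return {
        "max_abs": max(abs(v) for v in dev),
        "max_sq": max(v * v for v in dev),
        "sum_abs": sum(abs(v) for v in dev),
        "sum_sq": sum(v * v for v in dev),
    }


t = [[[1, 0], [0, 1]], [[1, 0], [1, 2]]]         # hypothetical two-level instance
balanced = objectives(t, [2, 1], [0, 1, 0])      # product 1 placed in the middle
batched = objectives(t, [2, 1], [0, 0, 1])       # product 0 produced in one batch
```

For this instance the interleaved sequence has a maximum absolute deviation of 1/3, whereas the batched sequence reaches 2/3, which reflects the smoothing role of the objectives.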


2.1 The NP-Hardness of ORVP 


For an input size $n$ of a problem, a generally accepted minimum requirement for 
an algorithm to be considered efficient is that its running time is polynomial in 
$n$, that is, $O(n^c)$ for some constant $c$. A decision problem is a problem whose 
output is a single Boolean value: yes or no, true or false, on or off, etc. Based on 
this definition, there are three classes of decision problems: P (solvable in polynomial 
time), NP (nondeterministic polynomial time) and co-NP (complements of problems 
in NP). The outstanding open question is whether P is equal to NP. 
For detailed accounts of computational complexity classes, we recommend [6, 7, 17, 
22, 26]. 

The crux of a combinatorial problem is to develop an algorithm that guarantees 
identifying an optimal solution for every instance of the problem. Unfortunately, 
not all combinatorial problems are known to possess an algorithm that requires only a 
small amount of computer time. For example, the Steiner tree problem, the 3-partition 
problem and the exact 3-dimensional matching problem are considered intractable. 

The ORVP with the sum of the square deviation objective has been shown to 
be NP-hard in the ordinary sense [13]. This result has been achieved by reducing 
the scheduling around the shortest job (SASJ) problem to the ORVP. The SASJ 
problem is to find a schedule, on a single machine, of independent 
jobs $i$, $i = 1, 2, \ldots, n$, that minimizes the sum $\sum_{i=1}^{n} (C_i - C_1)^2$, where $C_i$ is 
the completion time of job $i$, $i = 1, 2, \ldots, n$, with processing times 
$p_1 \le p_2 \le \cdots \le p_n$. The 
SASJ problem is NP-hard in the ordinary sense [12]. Moreover, the min-sum ORVP 
problem is computationally more difficult and the results established so far on the 
completion time variance minimization problem indicate that even special cases of 
ORVP are NP-hard. 

Furthermore, the bottleneck ORVP with absolute-deviation objective that considers 
only two levels of production has been proved to be NP-hard in the strong sense. An 
instance of the 3-partition problem can be reduced to an instance of ORVP with two 
levels in pseudo-polynomial time [16]. The 3-partition problem is to decide whether 
a given multiset of integers can be partitioned into triples that all have the same sum. 
That is, given $3m$ positive integers $a_1, a_2, \ldots, a_{3m}$ and a bound $B$ such that 
$\sum_{i=1}^{3m} a_i = mB$ and $B/4 < a_i < B/2$ for $1 \le i \le 3m$, is there a partition 
$\{A_1, A_2, \ldots, A_m\}$ of the index set $\{1, 2, \ldots, 3m\}$ such that 
$\sum_{i \in A_j} a_i = B$ for $1 \le j \le m$? It is well known that the 
3-partition problem is strongly NP-complete [22]. 
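
As a concrete illustration of the decision question, the following sketch checks a small 3-partition instance by exhaustive search. It is meant only to make the definition tangible: the search is exponential in the worst case, consistent with the hardness result, and the function name and example numbers are made up for this purpose.

```python
from itertools import combinations


def three_partition(a):
    """Return index triples whose values all sum to B = sum(a) / m, or None."""
    m, rem = divmod(len(a), 3)
    if rem or m == 0 or sum(a) % m:
        return None
    B = sum(a) // m

    def solve(free):
        # Always place the first free index; try every pair of partners for it.
        if not free:
            return []
        first, rest = free[0], free[1:]
        for j, k in combinations(rest, 2):
            if a[first] + a[j] + a[k] == B:
                tail = solve([i for i in rest if i not in (j, k)])
                if tail is not None:
                    return [(first, j, k)] + tail
        return None

    return solve(list(range(len(a))))


# Both instances have m = 2 and B = 100, with every value strictly between B/4 and B/2.
yes_instance = [26, 33, 41, 30, 35, 35]
no_instance = [26, 26, 26, 40, 41, 41]
triples = three_partition(yes_instance)
```

The first instance splits into 26 + 33 + 41 and 30 + 35 + 35, while no triple of the second instance sums to 100.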


3 Heuristics Paradigms for ORVP 


A number of heuristic sequencing methods have been developed and compared 
in the literature, owing to the popularity of JITPS during the 
1980s [4, 24, 25]. A heuristic approach for PRVP has 
recently been reported in [29]. In this work, we report the heuristics for ORVP. A complex 
heuristic for selecting the production sequence when the objective is to minimize 
the chance of stopping the line due to overloading individual stations is proposed 
in [21]. In this heuristic, the authors suggest a procedure which uses many different 
initial sequences. For each initial sequence, an improvement routine is applied in 
which jobs are moved until no improvement occurs, followed by an interchange of 
jobs until no improvement occurs. The best of the resulting sequences is the solution. 
Empirical results are presented for problems with up to 100 jobs, which suggest 
that the heuristic performs almost as well as a branch and bound procedure with a 
CPU time trap of 2 seconds (see also [9]). 

Monden [20] developed two greedy heuristics at Toyota, which he referred to as 
goal chasing methods: GCM I and GCM II (see also [11]). The heuristics GCM I and 
GCM II, designed with a product level and a sub-assembly level, construct a sequence 
by filling one position at a time from the first slot to the last one, considering the 
variability at the sub-assembly level. In comparison with GCM I, GCM II represents a 
decrease in computational time, since the sum is formed only over the components of a 
given product [24]. However, the comparative research in [24] and in [25] showed that 
GCM I performed better than GCM II when compared on the basis of maintaining 
a constant usage of component parts. These heuristics have been found to yield very 
good results at Toyota [10]. 

Hyundai’s heuristic (HH) was developed as an alternative that approximates 
the result given by GCM I while reducing the number of computational steps. Duplaga 
and Bragg [4] concluded that the reduction in computational effort related to HH 
may be significant in situations similar to automobile assembly where many options 
and choices are available for final product configurations. 

In [18], the GCM was advanced to the extended goal chasing method (EGCM), which 
considers all levels in a multi-level production system, and another 
polynomial heuristic was introduced to reduce the myopic nature of the previous heuristic. 
Moreover, the myopic nature of GCM I has been reduced and an exact procedure based 
on bounded dynamic programming has been developed in [1]. In the following three 
subsections, we briefly formalize the goal chasing heuristics. 


3.1 Goal Chasing Method I 


The goal chasing method I (GCM I) was developed and used by Toyota to schedule 
automobile final assembly lines. It constructs a sequence filling one time unit at a time 
from first slot to the last one. This method is designed with the two levels: the product 
level and the sub-assembly level, considering the variability at the sub-assembly level 
only, whereas the variability is ignored at the final level [18]. 

For a stage $k$, the objective function used in GCM I to schedule the product $p$ is 

$$ \text{minimize} \quad \mathrm{GCM\ I} = \sum_{i=1}^{n_2} \left( x_{i2(k-1)} + t_{i2p} - y_{2k} r_{i2} \right)^2. \tag{13} $$


GCM I is a myopic heuristic. It frequently yields an infeasible sequence, 
but if it yields a feasible sequence, then that sequence is necessarily optimal too [24]. 
The time complexity of GCM I is O(mnD). 


3.2 Goal Chasing Method II 


As in the case of GCM I, the GCM II constructs a sequence filling one time unit at 
a time from first slot to last one. The GCM II is designed to decrease computational 
time because the sum is formed only on the components of a given model [18]. 
This indicates that the computational time can be considerably saved if a model 
encompasses only a small fraction of the total number of parts [25]. 

For stage $k$, the objective function used in GCM II to schedule the product $p$ is 

$$ \text{minimize} \quad \mathrm{GCM\ II} = \sum_{i \in C} \left( x_{i2k} - y_{2k} r_{i2} \right)^2, \tag{14} $$

where $C$ is the set of components of a given model. If $C$ contains a small fraction of 
the total number of components, the computational time is substantially reduced. This 
heuristic is also myopic and frequently generates infeasible sequences. 

The goal chasing method has been extended to consider all levels in [18]; the result 
is called the extended goal chasing method (EGCM). It can be said that GCM I and 
GCM II are special cases of the EGCM. 
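
The greedy, slot-by-slot construction used by the goal chasing methods can be sketched for the two-level case of GCM I. The sketch rests on several assumptions that are not fixed by the text: the target usage of part i after stage k is taken as $y_{2k} r_{i2}$, only products with remaining demand are considered at each stage, and ties are broken in favor of the lowest product index. The function name and the example data are likewise illustrative.

```python
from fractions import Fraction


def goal_chasing(t2, d1):
    """Greedy GCM I-style sequence for products with demands d1.

    t2[i][p]: units of level-2 part i used by one unit of product p.
    At each stage the product whose scheduling gives the smallest sum of
    squared deviations of part usage from its target is chosen.
    """
    d2 = [sum(a * b for a, b in zip(row, d1)) for row in t2]
    r2 = [Fraction(x, sum(d2)) for x in d2]      # r_i2, ideal usage proportions
    used = [0] * len(t2)                         # x_i2(k-1), cumulative part usage
    remaining = list(d1)
    sequence = []
    for _ in range(sum(d1)):
        best_p, best_cost = None, None
        for p, left in enumerate(remaining):
            if left == 0:
                continue                         # demand for product p already met
            x = [u + row[p] for u, row in zip(used, t2)]
            y = sum(x)                           # y_2k if product p is scheduled
            cost = sum((x_i - y * r_i) ** 2 for x_i, r_i in zip(x, r2))
            if best_cost is None or cost < best_cost:
                best_p, best_cost = p, cost
        sequence.append(best_p)
        remaining[best_p] -= 1
        used = [u + row[best_p] for u, row in zip(used, t2)]
    return sequence


sequence = goal_chasing([[1, 0], [1, 2]], [2, 1])
```

Because each choice looks only one stage ahead, the procedure is myopic in the sense discussed above: a locally best product may force larger deviations in later stages.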


3.3 Extended Goal Chasing Method 


The extended goal chasing method (EGCM) is also a heuristic for the multi-level problem, 
but it takes more levels into account [18]. For a stage $k$, the objective function used 
in EGCM to schedule the product $p$ is 

$$ \text{minimize} \quad \sum_{k=1}^{D_1} \sum_{l=1}^{L} w_l \sum_{i=1}^{n_l} \left( x_{ilk} - y_{lk} r_{il} \right)^2, \tag{15} $$

where the weight $w_l$ determines the relative importance of a level $l$. The heuristic 
sequences model $p$ at time unit $k$ with minimum 

$$ \sum_{l=1}^{L} w_l \sum_{i=1}^{n_l} \left( x_{il(k-1)} + t_{ilp} - y_{lk} r_{il} \right)^2. \tag{16} $$


This is also a myopic polynomial heuristic. There exist two heuristics for the solution 
of the problem in [18]. 


4 The Pegging Assumption 


The output rate variation problem is an NP-hard combinatorial problem. However, it 
can be solved in pseudo-polynomial time under the pegging assumption, which separates 
each part at the lower production levels into distinct groups for each product 
into which that part will be assembled. The pegging process reduces the ORVP to a 
weighted case of the product rate variation problem [8, 23]. 

Under the pegging assumption, parts of output $i$ at level $l$, $l = 2, 3, \ldots, L$, are 
dedicated, or pegged, to be assembled into a particular model at level 1. The parts dedicated 
to be assembled into different models are distinct in pegging, i.e., $h \neq p$ implies 
$t_{ilh} \neq t_{ilp}$ for each output $i$ at level $l$. Pegging is useful for high quality model 
production because high quality parts are required for high quality models and such parts 
can be used under this assumption. The mathematical formulation of pegging in a 
JIT production environment was first developed in [8], where some heuristic 
procedures for the pegged multi-level min-sum model are also presented. 

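The sequencing model of the pegged ORVP with absolute deviation objective 
[23] is to minimize the following weighted deviation: 

$$ \min \max_{h,i,k,l} \left\{ w_{h1} |x_{h1k} - k r_{h1}|, \; w_{il} |x_{h1k} t_{ilh} - k t_{ilh} r_{h1}| \right\}, \tag{17} $$

where $h = 1, 2, \ldots, n_1$; $i = 1, 2, \ldots, n_l$; $k = 1, 2, \ldots, D_1$; $l = 2, 3, \ldots, L$, subject 
to the constraints of the single-level case [27]. 

Since $t_{i1h} = 1$ if $i = h$ and 0 otherwise, the objective function for $l = 1, 2, \ldots, L$ 
is reduced to $\min \max_{h,i,l,k} \{ w_{il} \, t_{ilh} \, |x_{h1k} - k r_{h1}| \}$. With 
$w_h^{*} = \max_{i,l} \{ w_{il} \, t_{ilh} \}$, the 
pegged ORVP is transformed into the following formulation, written in the single-level 
notation $x_{hk} = x_{h1k}$ and $r_h = r_{h1}$: 

$$ \min \max_{h,k} w_h^{*} |x_{hk} - k r_h| \tag{18} $$

with the constraints of the single-level case [27]. 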

Clearly, this is the weighted product rate variation problem formulation [23]. 
The pegged ORVP with total deviation objective can analogously be reduced to a 
weighted PRVP with total deviation objective [23]. The optimal schedules for the 
weighted PRVP with total deviation objective can be obtained using the assignment 
approach [13-15]. 
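
The reduction to a weighted single-level problem can be illustrated numerically. The sketch below assumes the reading in which each product h receives the weight $w_h^{*} = \max_{i,l} w_{il} t_{ilh}$ and the weighted objective is the maximum of $w_h^{*} |x_{hk} - k r_h|$ over products and stages; the function names, data layout and pegged example instance are hypothetical.

```python
from fractions import Fraction


def pegged_weights(w, t):
    """w_h^* = max over levels l and items i of w[l][i] * t[l][i][h]."""
    n1 = len(t[0][0])
    return [
        max(w_l[i] * t_l[i][h]
            for w_l, t_l in zip(w, t)
            for i in range(len(t_l)))
        for h in range(n1)
    ]


def weighted_max_deviation(w_star, d1, sequence):
    """max over products h and stages k of w_h^* * |x_hk - k * r_h|."""
    D1 = sum(d1)
    produced = [0] * len(d1)
    worst = Fraction(0)
    for k, p in enumerate(sequence, start=1):
        produced[p] += 1
        for h, x in enumerate(produced):
            dev = abs(x - Fraction(k * d1[h], D1))   # |x_hk - k r_h|
            worst = max(worst, w_star[h] * dev)
    return worst


# Pegged instance: level-2 part 0 goes only into product 0, part 1 only into product 1.
t = [[[1, 0], [0, 1]], [[2, 0], [0, 3]]]
w = [[1, 1], [1, 1]]                                 # unit weights at both levels
w_star = pegged_weights(w, t)
value = weighted_max_deviation(w_star, [2, 1], [0, 1, 0])
```

Here the level-2 usage coefficients dominate the unit weights, so the product weights become 2 and 3 and the sequence is judged by the single-level deviations alone.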


5 Dynamic Programming Solution 


Efficient algorithms for the solution of ORVP are unlikely to exist due to the 
NP-hardness of the problem. Nevertheless, the dynamic programming (DP) procedure 
yields optimal solutions [16] for a small number of products. The DP algorithm 
has also been applied to the problem with an objective that simultaneously minimizes 
the variability in the usage of parts and smooths the workload in the final assembly 
process [19]. 

The DP procedure for ORVP developed in [16] is polynomial in $D_1$ 
and consequently seems to be effective for a small number of products $n_1$, even when 
the total product demand $D_1$ is large. During the enumeration process, excessive 
time or space requirements are reduced by using some fast heuristic as a filter, which 
eliminates from the DP state space any states that cannot lead to an optimal solution. Two 
myopic heuristics to generate the filter are proposed in [16]. If the heuristics yield 
near-optimal sequences, then the state space size can be reduced. 

The weighted case of the output rate variation problem with the two sequencing 
objective functions 

$$ \min \max_{i,l,k} w_{il} |x_{ilk} - y_{lk} r_{il}| \quad \text{and} \quad \min \sum_{k=1}^{D_1} \sum_{l=1}^{L} \sum_{i=1}^{n_l} w_{il} (x_{ilk} - y_{lk} r_{il})^2 $$

can be concisely transformed into a matrix representation, and this transformation can 
be used for the solution of ORVP with the DP procedure [16]. 


First we consider the min-max objective function $\max_{i,l} w_{il} |x_{ilk} - y_{lk} r_{il}|$ and 
denote the deviation matrix $\Gamma = [\gamma_{ilp}]_{n \times n_1}$ with $n = \sum_{l=1}^{L} n_l$, where 
$\gamma_{ilp}$ represents the element in row $\sum_{m=1}^{l-1} n_m + i$ and column $p$. 
Now we have, 

$$ \max_{i,l} w_{il} |x_{ilk} - y_{lk} r_{il}| = \max_{i,l} \left| \sum_{p=1}^{n_1} w_{il} \left( t_{ilp} x_{p1k} - r_{il} \sum_{j=1}^{n_l} t_{jlp} x_{p1k} \right) \right| $$

$$ = \max_{i,l} \left| \sum_{p=1}^{n_1} w_{il} \left( t_{ilp} - r_{il} \sum_{j=1}^{n_l} t_{jlp} \right) x_{p1k} \right| = \max_{i,l} \left| \sum_{p=1}^{n_1} \gamma_{ilp} x_{p1k} \right|, $$

where $\gamma_{ilp} = w_{il} \left( t_{ilp} - r_{il} \sum_{j=1}^{n_l} t_{jlp} \right)$. 

Let the column vector $X_k = (x_{11k}, x_{21k}, \ldots, x_{n_1 1k})^{T}$ be the cumulative production 
at level 1 during the time periods 1 through $k$. Hence the objective function 
$Z_{\max} = \min \max_{i,l,k} w_{il} |x_{ilk} - y_{lk} r_{il}|$ at the time unit $k$ over all parts is 
transformed into matrix representation as follows: 

$$ \min \max_{i,l,k} w_{il} |x_{ilk} - y_{lk} r_{il}| = \min \max_{k} \|\Gamma X_k\|_1, $$

where the norm $\|\cdot\|_1$ is defined as the maximum $\|a\|_1 = \max_i |a_i|$, $i = 1, 2, \ldots, n$, 
for a vector $a = (a_1, a_2, \ldots, a_n)$. 

Let the demand vector at level 1 be $d = (d_1, d_2, \ldots, d_{n_1})$ and the states in a 
schedule be $X = (x_1, x_2, \ldots, x_{n_1})$ with cardinality $|X| = \sum_{i=1}^{n_1} x_i$, where $x_i$ is 
the cumulative production of model $i$, $x_i \le d_i$. Let $e_i$ be the unit vector with $n_1$ 
entries, all of which are zero except for a single 1 in the $i$th row. 

Define $\varphi(X)$ to be the minimum of the maximum absolute deviation for all parts 
and models over all partial schedules of $X$, where $\|\Gamma X\|_1$ is the maximum of the deviation 
of actual production from the ideal production over all parts and models when $X$ is 
the amount of model produced. 

The DP recursion for $\varphi(X)$ is as follows [16]: 

$$ \varphi(\emptyset) = \varphi(X : X = 0) = 0, $$

$$ \varphi(X) = \min_i \left\{ \max \{ \varphi(X - e_i), \|\Gamma X\|_1 \} : i = 1, \ldots, n_1, \; x_i \ge 1 \right\}. $$

For any state $X$, it is observed that $\varphi(X) \ge 0$ and $\|\Gamma (X : X = d)\|_1 = 0$. 
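
The min-max recursion can be written down almost literally as a memoized function over the states $X \le d$. The following is a minimal sketch under stated assumptions rather than the implementation of [16]: the function name, the nested-list layout `t[l][i][p]`, the default unit weights and the small instance are illustrative, and no heuristic filter is applied, so it is suitable only for small instances.

```python
from fractions import Fraction
from functools import lru_cache


def dp_min_max(t, d1, w=None):
    """Minimize max_{i,l,k} w_il |x_ilk - y_lk r_il| by DP over states X <= d1.

    Returns the optimal value and one optimal level-1 sequence.
    """
    n1 = len(d1)
    if w is None:
        w = [[1] * len(level) for level in t]    # unit weights by default
    # Build Gamma row by row: gamma_ilp = w_il * (t_ilp - r_il * sum_j t_jlp).
    gamma = []
    for level, w_l in zip(t, w):
        d_l = [sum(a * b for a, b in zip(row, d1)) for row in level]
        D_l = sum(d_l)
        col = [sum(row[p] for row in level) for p in range(n1)]
        for row, d_il, w_il in zip(level, d_l, w_l):
            r_il = Fraction(d_il, D_l)
            gamma.append([w_il * (row[p] - r_il * col[p]) for p in range(n1)])

    def norm(X):
        # Largest absolute entry of Gamma X, the norm used in the recursion.
        return max(abs(sum(g * x for g, x in zip(row, X))) for row in gamma)

    @lru_cache(maxsize=None)
    def phi(X):
        if not any(X):
            return Fraction(0), ()               # empty schedule
        here = norm(X)
        best = None
        for i in range(n1):
            if X[i] >= 1:
                prev = X[:i] + (X[i] - 1,) + X[i + 1:]
                value, seq = phi(prev)
                cand = (max(value, here), seq + (i,))
                if best is None or cand[0] < best[0]:
                    best = cand
        return best

    value, seq = phi(tuple(d1))
    return value, list(seq)


t = [[[1, 0], [0, 1]], [[1, 0], [1, 2]]]         # hypothetical two-level instance
value, sequence = dp_min_max(t, [2, 1])
```

Storing a sequence with every state keeps the sketch short but costs extra memory; a production implementation would more likely store only predecessor pointers and process states in order of increasing cardinality.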
Now we consider the objective function $\sum_{k=1}^{D_1} \sum_{l=1}^{L} \sum_{i=1}^{n_l} w_{il} (x_{ilk} - y_{lk} r_{il})^2$. 
That is, 

$$ \sum_{k=1}^{D_1} \sum_{l=1}^{L} \sum_{i=1}^{n_l} w_{il} (x_{ilk} - y_{lk} r_{il})^2 = \sum_{k=1}^{D_1} \sum_{l=1}^{L} \sum_{i=1}^{n_l} w_{il} \left( \sum_{p=1}^{n_1} \left( t_{ilp} x_{p1k} - r_{il} \sum_{j=1}^{n_l} t_{jlp} x_{p1k} \right) \right)^2 $$

$$ = \sum_{k=1}^{D_1} \left( \|\Omega X_k\|_2 \right)^2, $$

where $\Omega = [w_{il}^{1/2} \delta_{ilp}]$ and $\delta_{ilp} = t_{ilp} - r_{il} \sum_{j=1}^{n_l} t_{jlp}$. 
The Euclidean norm $\|\cdot\|_2$ is defined as $\|a\|_2 = \sqrt{\sum_{i=1}^{n} a_i^2}$ for a vector 
$a = (a_1, a_2, \ldots, a_n)$. 

Let $\Phi(X)$ be the minimum of the total square deviation for all 
parts and models over all partial schedules of $X$. The term $(\|\Omega X\|_2)^2$ is the sum of 
square deviations of actual production from the ideal production over all parts and 
models when $X$ is the amount of model produced. 
The DP recursion for $\Phi(X)$ is as follows [16]: 

$$ \Phi(\emptyset) = \Phi(X : X = 0) = 0, $$

$$ \Phi(X) = \min_i \left\{ \Phi(X - e_i) + (\|\Omega X\|_2)^2 : i = 1, 2, \ldots, n_1, \; x_i \ge 1 \right\}. $$

It is always true that $\Phi(X) \ge 0$ for any state $X$, and $\|\Omega X\|_2 = 0$ for $X = d$. In any 
state $X$, $x_i$ can have any of the values $0, 1, \ldots, d_i$. The number of states in the DP 
recursion is $\prod_{i=1}^{n_1} (d_i + 1)$. 

Any state $X$ can be generated from at most $n_1$ states. The computation time for $\|\Gamma X\|_1$ 
or $(\|\Omega X\|_2)^2$ is $O(n n_1)$, $n = \sum_{m=1}^{L} n_m$. The space and time complexities of the DP 
procedures are $O\left( \prod_{i=1}^{n_1} (d_i + 1) \right)$ and $O\left( n n_1 \prod_{i=1}^{n_1} (d_i + 1) \right)$, respectively [19]. 

The number of feasible schedules for any problem instance is $D_1! / \prod_{i=1}^{n_1} d_i!$. This is considerably larger than the number of states in the DP recursion. The inequality

$$\prod_{i=1}^{n_1} (d_i + 1) \le \left(\frac{D_1}{n_1} + 1\right)^{n_1}$$

shows that the DP algorithm is effective for a small number of products even with many copies.

Excessive time or space requirements can be reduced by using fast heuristics as a filter. The filter eliminates from the DP's state space any state that cannot lead to an optimal solution. Two myopic heuristics exist for generating the filter [16]. The first schedules next the model $i$ that minimizes $\|\Gamma(X + e_i)\|_1$; the second schedules next the model $i$ that minimizes $\max\{\|\Gamma(X + e_i)\|_1,\ \min_j \|\Gamma(X + e_i + e_j)\|_1\}$.

The DP algorithm progresses through the state space in the forward direction of increasing cardinality: the procedure generates all states $X$ with $|X| = k$ before those with $|X| = k + 1$, $k = 1, 2, \dots, D_1$ [16].

It is noteworthy that the output rate variation problem with a commutative aggregation function that aggregates deviations over all production cycles, known as the symmetric output rate variation problem, has been solved by the dynamic programming procedure [5].
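The min-max recursion of this section can be illustrated by a minimal Python sketch. It is not the procedure of [16]: as a simplifying assumption it treats a single level only, so the deviation of a state $X$ with $k = |X|$ is taken to be $\max_i |x_i - k\,d_i/D_1|$, and the function name `minmax_dp` is hypothetical. States are generated in order of increasing cardinality, as described above.

```python
from itertools import product


def minmax_dp(demands):
    """Evaluate phi(X) = min_i max(phi(X - e_i), dev(X)) over all states 0 <= X <= d.

    Assumption (not from the source): single-level deviation
    dev(X) = max_i |x_i - |X| * d_i / D|, with D the total demand.
    Returns (phi(d), number of states).
    """
    n = len(demands)
    total = sum(demands)
    rates = [d / total for d in demands]

    def dev(state):
        k = sum(state)
        return max(abs(x - k * r) for x, r in zip(state, rates))

    # All states, ordered by cardinality |X| so predecessors are ready first.
    states = sorted(product(*(range(d + 1) for d in demands)), key=sum)
    phi = {states[0]: 0.0}  # the empty schedule has value 0
    for state in states[1:]:
        best = float("inf")
        for i in range(n):
            if state[i] >= 1:
                prev = state[:i] + (state[i] - 1,) + state[i + 1:]
                best = min(best, max(phi[prev], dev(state)))
        phi[state] = best
    return phi[tuple(demands)], len(states)


if __name__ == "__main__":
    value, n_states = minmax_dp((2, 1))
    print(value, n_states)
```

The state count returned alongside the optimum equals $\prod_i (d_i + 1)$, which is the quantity that drives the space requirement of the recursion.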


6 Conclusion 


The mixed-model just-in-time sequencing problem has been widely studied with various mathematical formulations and solution strategies. However, it remains a challenging area because its base model is of theoretical interest and has wide real-world applications. The PRVP is solvable in pseudo-polynomial time, but the ORVP is NP-hard. Whether cyclic sequences are optimal for the ORVP also remains open. Input-output matrix analysis could be another approach to the multi-level problem. The simultaneous study of production and logistics is a challenging area with many research issues [30]. Our further work will focus on the synchronized study of production and logistics to balance overall supply chain systems. Thus, this paper not only reviews the existing literature but also points to directions for further work.
$L^p$-Boundedness of Two Singular Integral
Operators of Convolution Type


Sten Kaijser and John Musonda 


Abstract We investigate boundedness properties of two singular integral operators defined on $L^p$-spaces ($1 < p < \infty$) on the real line, both as convolution operators on $L^p(\mathbb{R})$ and on the spaces $L^p(\omega)$, where $\omega(x) = 1/(2\cosh(\pi x/2))$. It is proved that both operators are bounded on these spaces and estimates of the norms are obtained. This is achieved by first proving boundedness for $p = 2$ and weak boundedness for $p = 1$, and then using interpolation to obtain boundedness for $1 < p < 2$. To obtain boundedness also for $2 < p < \infty$, we use duality in the translation invariant case, while the weighted case is partly based on the expositions on the conjugate function operator in (M. Riesz, Mathematische Zeitschrift, 27, 218-244, 1928) [7].


Keywords Convolution operators - Sech (function) - Hilbert transform - Hardy 
space - Weak type estimates 


1 Introduction 


In [4, 5], three systems of orthogonal polynomials belonging to the class of Meixner–Pollaczek polynomials were described together with some operators connecting them. The first system was the special case of the Meixner–Pollaczek polynomials [3, 6] with parameter $\lambda = 1/2$, a system that can also be described as the orthogonal polynomials obtained from the weight function $\omega(x) = 1/(2\cosh(\pi x/2))$. The second system was a limiting case of the Meixner–Pollaczek polynomials with the parameter $\lambda$ tending to 0. That system could also be described as the polynomials orthogonal in the strip $S = \{z \in \mathbb{C} : |\operatorname{Im} z| < 1\}$ with respect to the Poisson measure for the origin. In the third system, the polynomials were orthogonal with respect to the weight $\omega * \omega(x) = x/(2\sinh(\pi x/2))$.

These systems are connected by two operators $R$ and $J$ (see [1] or [2]) mapping functions in the strip $S$ to functions on the real line $\mathbb{R}$, defined by

$$Rf(x) = \frac{f(x+i) + f(x-i)}{2} \quad \text{and} \quad Jf(x) = \frac{f(x+i) - f(x-i)}{2i}.$$


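Besides these two operators, the operators $B = R^{-1}$ and $S = JR^{-1}$ turn out to have interesting properties with respect to these systems of polynomials. Both operators can be represented as convolution operators

$$Bf(z) = \int_{-\infty}^{\infty} \frac{f(t)\,dt}{2\cosh\frac{\pi}{2}(z-t)} \quad \text{and} \quad Sf(x) = \lim_{\varepsilon \to 0} \int_{|x-t|>\varepsilon} \frac{f(t)\,dt}{2\sinh\frac{\pi}{2}(x-t)},$$

leading to the Fourier transforms

$$\widehat{Bf}(t) = \operatorname{sech} t\,\hat f(t) \quad \text{and} \quad \widehat{Sf}(t) = i\tanh t\,\hat f(t).$$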


These two operators can be studied in the context of either real or complex analysis, and in this paper we consider the operator $B$ as an operator from functions on the real line $\mathbb{R}$ to functions in the strip $S$, while the operator $S$ is studied as an operator on functions on $\mathbb{R}$. Function spaces on $\mathbb{R}$ are denoted by $L$ and those on $S$ by $H$. For an arbitrary non-negative and locally integrable function $\omega$ on $\mathbb{R}$, $L^p(\omega)$ denotes measurable functions on $\mathbb{R}$ with

$$\|f\|^p_{L^p(\omega)} = \int_{-\infty}^{\infty} |f(x)|^p\,\omega(x)\,dx < \infty,$$

and $H^p(\omega)$ analytic functions on $S$ with

$$\|F\|^p_{H^p(\omega)} = \sup_{-1 < a < 1} \int_{-\infty}^{\infty} |F(x + ia)|^p\,\omega(x)\,dx < \infty.$$

Furthermore, $L^p(\mathbb{R}) = L^p(1)$ and $H^p(S) = H^p(1)$. Unless stated otherwise, we assume throughout that $1 < p < \infty$, $F = Bf$ and $\omega(x) = 1/(2\cosh(\pi x/2))$.
We investigate boundedness properties of $B$ and $S$, both as convolution operators in the translation invariant case and for the weight $\omega(x) = 1/(2\cosh(\pi x/2))$. Our main results are the following.


Theorem 1 For $1 < p < \infty$, the operator $B$ is linear and bounded from
(a) $L^p(\mathbb{R})$ to $H^p(S)$,
(b) $L^p(\omega)$ to $H^p(\omega)$.


Theorem 2 For $1 < p < \infty$, the operator $S$ is linear and bounded on the spaces $L^p(\mathbb{R})$ and $L^p(\omega)$.

For $p = 2$, both of these results were proved in [5]. In this paper we first prove weak boundedness for $p = 1$, and then use interpolation to obtain boundedness for $1 < p < 2$. To obtain boundedness also for $2 < p < \infty$, we use duality in the translation invariant case and the method of M. Riesz [7] in the weighted case.


2 Weak Boundedness for p = 1 


We denote the Lebesgue measure of a measurable set $E$ of real numbers by $|E|$ if given by $dx$, and by $|E|_\omega$ if given by $\omega(x)\,dx$. We further write $A_0(S)$ to denote the space of functions $f$ that are analytic in the strip $S$, continuous on the closed strip $\bar S$ and such that $|f(z)| \to 0$ when $|z| \to \infty$.

We shall prove the following result which, perhaps, is interesting in its own right. 


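Proposition 1 Let $\lambda > 0$ and $E_\lambda = \{x : |Bf(x \pm i)| > \lambda\}$. If $f \in L^1(\mathbb{R})$, then

$$|E_\lambda| \le \frac{16}{\lambda}\,\|f\|_{L^1(\mathbb{R})},$$

and if $f \in L^1(\omega)$, then

$$|E_\lambda|_\omega \le \frac{16}{\lambda}\,\|f\|_{L^1(\omega)}.$$


Corollary 1 Let $\lambda > 0$ and $E^S_\lambda = \{x : |Sf(x)| > \lambda\}$. If $f \in L^1(\mathbb{R})$, then

$$|E^S_\lambda| \le \frac{16}{\lambda}\,\|f\|_{L^1(\mathbb{R})},$$

and if $f \in L^1(\omega)$, then

$$|E^S_\lambda|_\omega \le \frac{16}{\lambda}\,\|f\|_{L^1(\omega)}.$$
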
The main idea is to first consider the case when f is positive, and we have then 
the following. 


Lemma 1 Let $f$ be a continuously differentiable function on $\mathbb{R}$ with compact support, $\lambda > 0$ and $E_\lambda = \{x : |Bf(x \pm i)| > \lambda\}$. If $f$ is such that

$$\int_{-\infty}^{\infty} f(x)\,dx = 1,$$

then $|E_\lambda| \le 2/\lambda$, and if $f$ is such that

$$\int_{-\infty}^{\infty} f(x)\,\omega(x)\,dx = 1,$$

then $|E_\lambda|_\omega \le 2/\lambda$.

Proof The first part of the proof is the same for both assertions, and we start by denoting

$$R^{-1}f(z) = Bf(z) = F(z) = \int_{-\infty}^{\infty} \frac{f(t)\,dt}{2\cosh\frac{\pi}{2}(z-t)}.$$


We observe that $F$ is analytic in $S$, and using partial integration, we see that $F$ is continuous on the closed strip and that $F \in A_0(S)$. It is obvious that $F(\bar z) = \overline{F(z)}$ and that $F$ is real-valued on $\mathbb{R}$. Furthermore, we have

$$\frac{1}{\cosh(x+iy)} = \frac{\cosh(x-iy)}{|\cosh(x+iy)|^2} = \frac{2(\cosh x\cos y - i\sinh x\sin y)}{\cosh 2x + \cos 2y},$$

and this implies that $\operatorname{Re}(F(z)) > 0$ in $S$. Let further, for a given $\lambda > 0$,

$$\varphi_\lambda(z) = \frac{z}{z+\lambda}.$$

It is easy to see that $\varphi_\lambda$ maps the real line to itself, leaves the origin fixed and maps $\infty$ to 1. This implies that the imaginary axis is mapped to the circle $|z - 1/2| = 1/2$. It is also clear that the circle $|z| = \lambda$ is mapped to the line $\operatorname{Re}(z) = 1/2$ so that the right half-plane is mapped to the interior of the circle $|z - 1/2| = 1/2$ and that $|z| > \lambda$ implies that $\operatorname{Re}(\varphi_\lambda(z)) > 1/2$.
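
The mapping properties just listed can be checked numerically. The following Python sketch assumes the map has the form $z \mapsto z/(z+\lambda)$ with $\lambda > 0$ (which agrees with the value $1/(1+\lambda)$ at $z = 1$ used later in the proof); the sample points and the value of `lam` are arbitrary illustrative choices.

```python
import cmath


def phi(z, lam):
    """Moebius map z -> z / (z + lam), assumed form of the map in the proof (lam > 0)."""
    return z / (z + lam)


lam = 2.5

# Points on the imaginary axis should land on the circle |w - 1/2| = 1/2.
circle_dist = [abs(phi(1j * y, lam) - 0.5) for y in (-7.0, -0.3, 0.0, 1.0, 40.0)]

# Points on the circle |z| = lam (away from z = -lam) should land on Re(w) = 1/2.
line_real = [phi(lam * cmath.exp(1j * t), lam).real for t in (-1.5, -0.4, 0.0, 0.9, 1.5)]

# Points in the right half-plane with |z| > lam should satisfy Re(w) > 1/2.
outside_real = [phi(z, lam).real for z in (3 + 0j, 1 + 3j, 0.1 - 9j)]

if __name__ == "__main__":
    print(circle_dist, line_real, outside_real)
```

The last property follows from the identity $\operatorname{Re}\,\frac{z}{z+\lambda} - \frac12 = \frac{|z|^2 - \lambda^2}{2|z+\lambda|^2}$, which the sample values reflect.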

We now consider the first assertion of the lemma, and for this, we use Cauchy's theorem. We observe therefore that for all $-1 \le a \le 1$, we have

$$\int_{-\infty}^{\infty} F(x+ia)\,dx = \int_{-\infty}^{\infty} F(x)\,dx = 1.$$

The next step is to see that, with $G = \varphi_\lambda \circ F$,

$$\int_{-\infty}^{\infty} G(x)\,dx = \int_{-\infty}^{\infty} \frac{F(x)}{F(x)+\lambda}\,dx \le \frac{1}{\lambda} \int_{-\infty}^{\infty} F(x)\,dx = \frac{1}{\lambda}.$$

It follows again from Cauchy's theorem that also

$$\int_{-\infty}^{\infty} G(x \pm i)\,dx = \int_{-\infty}^{\infty} G(x)\,dx \le \frac{1}{\lambda},$$

so that by Chebyshev's inequality,

$$|\{x : \operatorname{Re}(G(x \pm i)) > 1/2\}| \le \frac{2}{\lambda}.$$

However, as observed above,

$$\{x : \operatorname{Re}(G(x \pm i)) > 1/2\} = \{x : |F(x \pm i)| > \lambda\} = E_\lambda,$$

and this proves the first assertion.

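To prove the second assertion, we need the fact that the measure $\omega(x)\,dx$ is closely related to the Poisson measure for $S$; see [8]. In fact, we have for $u$ harmonic in $S$, continuous on $\bar S$ and such that $|u(x)| \le e^{a|x|}$ for some $a$, $0 < a < 1/2$, that

$$u(0) = \int_{-\infty}^{\infty} \frac{u(x+i) + u(x-i)}{2}\,\omega(x)\,dx = \int_{-\infty}^{\infty} Ru(x)\,\omega(x)\,dx.$$

Applying this to the function $F$, we see that

$$F(0) = \int_{-\infty}^{\infty} RR^{-1}f(x)\,\omega(x)\,dx = \int_{-\infty}^{\infty} f(x)\,\omega(x)\,dx = 1.$$

We next observe that

$$G(0) = \frac{F(0)}{F(0)+\lambda} = \frac{1}{1+\lambda}.$$

Since $G(x-i) = \overline{G(x+i)}$, it follows that

$$G(0) = \int_{-\infty}^{\infty} RG(x)\,\omega(x)\,dx = \int_{-\infty}^{\infty} \operatorname{Re}(G(x+i))\,\omega(x)\,dx = \frac{1}{1+\lambda}.$$

Therefore, by the same argument as before, it follows that $|E_\lambda|_\omega \le 2/\lambda$, and this proves the lemma.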


If $f$ is real-valued, then

$$Bf(x \pm i) = f(x) \pm iSf(x),$$

so the same conclusion holds a fortiori for the set $E^S_\lambda = \{x : |Sf(x)| > \lambda\}$.
We can now prove the proposition.


Proof of Proposition 1 We prove only the first assertion since the proof of the second assertion is exactly the same. We first consider the case when $f$ is real-valued. We write $f^+ = \max(f, 0)$, $f^- = \max(-f, 0)$ and hence $f = f^+ - f^-$. We observe that in order to have $|Bf| > \lambda$, we have to have either $|Bf^+| > \lambda/2$ or $|Bf^-| > \lambda/2$, and since $\|f\| = \|f^+\| + \|f^-\|$, this implies that $|E_\lambda| \le (4/\lambda)\|f\|$. If $f$ is complex-valued, we write $f = g + ih$ and essentially the same argument as above implies that

$$|E_\lambda| \le \frac{16}{\lambda}\,\|f\|,$$

and this proves the proposition (under the assumption that $f$ is continuously differentiable with compact support, but since such functions are dense in $L^1$, the same holds by continuity for all functions in $L^1$). Again the same result holds for the operator $S$.


3 The Case $1 < p < 2$


In [5], it was proved that the operator $B$ is bounded from $L^2$ to $H^2$ with norm less than or equal to 2 (both in the translation invariant case and for the weight $\omega$), and that $S$ is bounded on $L^2$ with norm 1 (again in both cases). Using the Marcinkiewicz interpolation theorem, we can now prove the following.


Lemma 2 The operators $B$ and $S$ are strongly bounded for $1 < p < 2$ with norm at most

$$\frac{16p}{(p-1)(2-p)}.$$

Proof This follows immediately from the Marcinkiewicz interpolation theorem.


Choosing $p = 4/3$, we see that for $T = B$ or $T = S$, we have $\|T\|_{4/3} \le 96$.
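
The arithmetic behind the constant 96, and the Riesz–Thorin exponent used between $p_0 = 4/3$ and $p_1 = 2$, can be reproduced with a short Python sketch. The function names are hypothetical; the formulas are the bound $16p/((p-1)(2-p))$ and the relation $1/p = \theta/(4/3) + (1-\theta)/2$, which gives $\theta(p) = 4/p - 2$.

```python
from fractions import Fraction


def marcinkiewicz_bound(p):
    """Bound 16p / ((p - 1)(2 - p)), meaningful for 1 < p < 2."""
    p = Fraction(p)
    return 16 * p / ((p - 1) * (2 - p))


def theta(p):
    """Interpolation parameter between 4/3 and 2: 1/p = theta/(4/3) + (1 - theta)/2."""
    return 4 / Fraction(p) - 2


def riesz_thorin_bound(p, norm_at_2):
    """Interpolated bound 96**theta * norm_at_2**(1 - theta) for 4/3 <= p <= 2."""
    t = float(theta(p))
    return 96.0 ** t * norm_at_2 ** (1.0 - t)


if __name__ == "__main__":
    print(marcinkiewicz_bound(Fraction(4, 3)))          # exact rational value at p = 4/3
    print(riesz_thorin_bound(Fraction(3, 2), 1.0))      # endpoint norm 1 at p = 2 (operator S)
    print(riesz_thorin_bound(Fraction(3, 2), 2.0))      # endpoint norm 2 at p = 2 (operator B)
```

Exact rational arithmetic is used for the first two functions so that the value at $p = 4/3$ is not affected by rounding.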


We can now use the Riesz–Thorin theorem [9] for $4/3 < p < 2$ to obtain the following result.


Proposition 2 Let $1 < p < 2$.


(a) The operator $S$ is bounded on $L^p$ with norm at most

$$\frac{16p}{(p-1)(2-p)}$$

for $1 < p < 4/3$, and at most $96^{\theta(p)}$, where $\theta(p) = 4/p - 2$, for $4/3 < p < 2$.

(b) The operator $B$ is bounded as an operator from $L^p$ to $H^p$ with norm at most

$$\frac{16p}{(p-1)(2-p)}$$

for $1 < p < 4/3$, and at most $96^{\theta(p)} \cdot 2^{1-\theta(p)}$, where $\theta(p) = 4/p - 2$, for $4/3 < p < 2$.

Proof For $1 < p < 4/3$, this was the preceding lemma. For $4/3 < p < 2$, it follows from the Riesz–Thorin theorem. □


4 The Case $2 < p < \infty$


We first observe that in the translation invariant case the operator S is self-adjoint so 
that by duality we have immediately the following result. 


Proposition 3 (a) The operator $S$ is bounded on the space $L^p(\mathbb{R})$ with norm at most $96^{\theta(p)}$, where $\theta(p) = 2 - 4/p$, for $2 < p < 4$, and with norm at most $24p$ for $4 < p < \infty$.

(b) The operator $B$ is bounded as an operator from $L^p(\mathbb{R})$ to $H^p(S)$ with norm at most $1 + \|S\|_p$.


Proof (a) follows from duality while (b) follows from the fact that on the boundary of $S$, we have $Bf = f \pm iSf$ so that $\|Bf\| \le \|f\| + \|Sf\|$.
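
The duality step can be made concrete with a small Python sketch. It evaluates the bound $16q/((q-1)(2-q))$ at the conjugate exponent $q = p' = p/(p-1)$, which simplifies to $16p(p-1)/(p-2)$, and compares it with $24p$; the comparison holds once $p \ge 4$, since then $(p-1)/(p-2) \le 3/2$. The function names are hypothetical.

```python
from fractions import Fraction


def dual_exponent(p):
    """Conjugate exponent p' = p / (p - 1)."""
    p = Fraction(p)
    return p / (p - 1)


def interpolation_bound(q):
    """Bound 16q / ((q - 1)(2 - q)) for 1 < q < 2."""
    q = Fraction(q)
    return 16 * q / ((q - 1) * (2 - q))


def dual_bound(p):
    """The bound above evaluated at p'; for p > 2 this equals 16 p (p - 1) / (p - 2)."""
    return interpolation_bound(dual_exponent(p))


if __name__ == "__main__":
    for p in (4, 6, 10, 100):
        print(p, dual_bound(p), 24 * p)
```

At $p = 4$ the two expressions coincide (both equal 96), which is where the two regimes of the estimate meet.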


To prove boundedness also on the space $L^p(\omega)$ for $2 < p < \infty$, we use the same idea that M. Riesz used when proving boundedness of the conjugate function operator, i.e. by considering even powers.


Proposition 4 Let $f \in L^{2n}(\omega)$ be real-valued and such that $\|f\|_{L^{2n}(\omega)} = 1$. Then

$$\|Sf\|_{L^{2n}(\omega)} \le \frac{2n}{\log 2}\,\|f\|_{L^{2n}(\omega)}.$$


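Proof Let $F = Bf$, then

$$F(0)^{2n} = \int_{-\infty}^{\infty} R(F(x)^{2n})\,\omega(x)\,dx = \int_{-\infty}^{\infty} \operatorname{Re}(f(x) + iSf(x))^{2n}\,\omega(x)\,dx.$$

Denoting $Sf$ by $g$, we have

$$\operatorname{Re}(f + iSf)^{2n} = \operatorname{Re}(f + ig)^{2n} = \sum_{k=0}^{n} \binom{2n}{2k} (-1)^k f^{2n-2k} g^{2k},$$

and this implies that

$$0 \le F(0)^{2n} = \int_{-\infty}^{\infty} \sum_{k=0}^{n} \binom{2n}{2k} (-1)^k f(x)^{2n-2k} g(x)^{2k}\,\omega(x)\,dx \le 1.$$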
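Writing $X$ for $\|g\|_{L^{2n}(\omega)}$, we see that

$$X^{2n} = \int_{-\infty}^{\infty} g(x)^{2n}\,\omega(x)\,dx \le \sum_{k=1}^{n} \binom{2n}{2k} \int_{-\infty}^{\infty} |f(x)|^{2k}\,|g(x)|^{2n-2k}\,\omega(x)\,dx.$$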


From Hölder's inequality, it follows that

$$X^{2n} \le \sum_{k=0}^{2n-1} \binom{2n}{k} X^k = (X+1)^{2n} - X^{2n},$$

and therefore $2X^{2n} \le (X+1)^{2n}$, so that $2^{1/2n} X \le X + 1$ or $(2^{1/2n} - 1)X \le 1$. Since $(2^{1/2n} - 1) > \log 2/2n$, it follows that $X < 2n/\log 2$.
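
The final step uses the elementary inequality $e^x - 1 > x$ with $x = \log 2/(2n)$. A short Python sketch (the function name is hypothetical) compares the exact constant $1/(2^{1/2n} - 1)$ with the simpler upper bound $2n/\log 2$ for a few values of $n$.

```python
import math


def exact_constant(n):
    """Return 1 / (2**(1/(2n)) - 1), the bound on X implied by (2**(1/(2n)) - 1) X <= 1."""
    return 1.0 / (2.0 ** (1.0 / (2 * n)) - 1.0)


def simple_bound(n):
    """Return 2n / log 2, the bound quoted in the text."""
    return 2 * n / math.log(2)


if __name__ == "__main__":
    for n in (1, 2, 5, 50):
        print(n, exact_constant(n), simple_bound(n))
```

Since $1/(e^x - 1) = 1/x - 1/2 + O(x)$, the two quantities differ by roughly $1/2$ for large $n$, so little is lost by the simplification.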


Remark 1 A more careful analysis of the binomial sum shows that it is actually
possible to have at least
ak ey, 


which shows that the denominator log 2 can easily be replaced by log 3. 


Remark 2 Using Cauchy's theorem, essentially the same idea can also be used in the translation invariant case. One observes that if $f$ is real-valued and $\|f\|_{2n} = 1$, then $\|\omega * f\|_{2n} \le 1$, and since

$$\int_{-\infty}^{\infty} (f(x) + iSf(x))^{2n}\,dx = \int_{-\infty}^{\infty} (\omega * f(x))^{2n}\,dx,$$

we see again that $\|Sf\|_{2n} \le 2n/\log 2$. If we do not assume $f$ to be real-valued, it follows that

$$\|Sf\|_{2n} \le \frac{4n}{\log 2}\,\|f\|_{2n}.$$


Using the Riesz–Thorin theorem, we see that for $2 < p < \infty$,

$$\|Sf\|_p \le \frac{4p}{\log 2}\,\|f\|_p,$$

so that $\|S\|_p \le 4p/\log 2$.


Remark 3 In the translation invariant case, we can now use duality to move from $2 < p < \infty$ to $1 < p < 2$ and thus obtain the better estimate that

$$\|S\|_p \le \frac{4p'}{\log 2},$$

where $p' = p/(p-1)$.

Finally as above, we have that for all $1 < p < \infty$,

$$\|B\|_p \le 1 + \|S\|_p.$$
Fractional-Wavelet Analysis of Positive
Definite Distributions and Wavelets on $\mathcal{D}'(\mathbb{C})$


Emanuel Guariglia and Sergei Silvestrov 


Abstract In this chapter we describe a wavelet expansion theory for positive definite distributions over the real line and define a fractional derivative operator for complex functions in the distribution sense. In order to obtain a characterization of the complex fractional derivative through the distribution theory, the Ortigueira-Caputo fractional derivative operator ${}_C D^\alpha$ [13] is rewritten as a convolution product according to the fractional calculus of real distributions [8]. In particular, the fractional derivative of the Gabor–Morlet wavelet is computed together with its plots and main properties.


Keywords Wavelet basis - Positive definite distribution - Complex fractional deriv- 
ative - Gabor—Morlet wavelet 


1 Introduction 


In recent years, wavelet analysis and fractional calculus have proven to be powerful tools in several areas of mathematics. Indeed, the time-frequency localization property provided by the wavelet approach makes it possible to use a wavelet basis as a mathematical microscope in order to better investigate the behavior of a function by the well-known Heisenberg box [18] located in the time-frequency plane.

Wavelet expansions are used to characterize different function spaces, such as $L^p$-spaces, Sobolev spaces, Morrey–Campanato spaces, etc. [19]. In particular, several key concepts of wavelet analysis, such as the wavelet transform, can be extended to the space of tempered distributions $\mathcal{S}'(\mathbb{R})$.
In this chapter, a wavelet expansion for the family of positive definite distributions
is presented. It has many applications in different areas of both pure and applied
mathematics (Lie groups, maximum entropy methods, etc.) [7, 14] and can be gen- 
eralized for the class of tempered distributions [16]. Furthermore, an open problem 
about the reconstruction formula for Shannon wavelets in the distribution sense is 
proposed. 

In [3, 4] a generalization to complex functions of the distribution theory is pre- 
sented. In particular, these two papers provide a generalization of the classical Dirac 
delta in the complex plane which gives the possibility to rewrite the complex frac- 
tional derivative in the distribution sense. 

Indeed, the fractional derivative of complex functions is provided by the Ortigueira-Caputo fractional derivative ${}_C D^\alpha$ [13], which can be written as a convolution of the given complex function with a suitable function that defines a regular distribution on $\mathbb{C}$ (see (30) and (31)).

The authors have computed the fractional derivative of the Gabor—Morlet wavelet 
through the Ortigueira-Caputo operator. It represents a wavelet family with several 
applications in signal theory and geophysics. 

This chapter is organized as follows: some preliminaries and notations on function 
spaces and wavelet analysis are provided in Sect.2. A wavelet expansion for the 
positive definite distributions is shown in Sect.3. The fractional differentiation in 
the complex plane, together with the complex-variable distribution theory, is given 
in the first part of the Sect.4, while in the second part it is widely explained how 
the Ortigueira-Caputo fractional derivative can be rewritten in the distribution sense. 
The fractional derivative of the Gabor—Morlet wavelet is computed in Sect. 5. 


2 Preliminaries and Notations 


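In this chapter, $n$ will denote an element of $\mathbb{N}_0 = \mathbb{N} \cup \{0\}$, and $i$ the imaginary unit. The fractional and the integer parts of a real number $x$ will be indicated by $\{x\}$ and $\lfloor x \rfloor$, respectively. The Heaviside step function $u$ and the sinc function are defined, respectively, by

$$u(x) = \begin{cases} 1, & x \ge 0, \\ 0, & x < 0, \end{cases} \qquad (1)$$

$$\operatorname{sinc}(x) = \frac{\sin(\pi x)}{\pi x} = \frac{e^{i\pi x} - e^{-i\pi x}}{2i\pi x}. \qquad (2)$$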

The $L^2$-inner product for complex-valued functions on an interval $[a, b]$ is given by

$$\langle f, g \rangle = \int_a^b f(x)\,\overline{g(x)}\,dx, \qquad (3)$$

where the Dirac bracket notation is adopted (in order to emphasize the action of any tempered distribution on a test function). For $f, g \in L^1_{\mathrm{loc}}(\mathbb{R})$, the convolution of $f$ and $g$ is defined by

$$(f * g)(x) = \int f(t)\,g(x - t)\,dt, \qquad (4)$$

i.e. $(f * g)(x) = \langle f(t), g(-(t - x)) \rangle$. The dual of a normed space $V(F)$ will be indicated with $V'(F)$.


2.1 Space Functions, Orthogonal Wavelets and Wavelet 
Transform of Distributions 


The distributions over $\mathbb{R}$ are defined as the dual space of the test functions $\mathcal{D}(\mathbb{R})$, namely $\mathcal{D}'(\mathbb{R})$ is the space of all linear and continuous functionals on the space of the test functions. Similarly, the space of tempered distributions is the dual of the Schwartz space $\mathcal{S}(\mathbb{R})$. In other words, $\mathcal{S}'(\mathbb{R})$ is the space of all linear and continuous functionals on $\mathcal{S}(\mathbb{R})$, namely the set of all continuous linear maps

$$f : \mathcal{S}(\mathbb{R}) \to F,$$

where $F$ is usually $\mathbb{R}$ or $\mathbb{C}$ [17]. In this chapter it will always be taken $F = \mathbb{C}$.

The space of highly time-frequency localized functions over $\mathbb{R}$ is denoted by $\mathcal{S}_0(\mathbb{R})$ and defined as the space of the functions $\phi \in \mathcal{S}(\mathbb{R})$ such that all the moments vanish, namely

$$\int_{-\infty}^{\infty} x^n\,\phi(x)\,dx = 0, \quad \forall n \in \mathbb{N}_0, \qquad (5)$$

where the topology on this function space is defined in the classical way [11].
Let $\psi \in \mathcal{S}(\mathbb{R})$ be a mother wavelet and let

$$\psi_{a,b}(t) = \frac{1}{\sqrt{a}}\,\psi\!\left(\frac{t-b}{a}\right)$$

be its corresponding daughter wavelet [18]. The wavelet transform of $f \in \mathcal{S}'(\mathbb{R})$ with respect to $\psi$ is given by

$$W_\psi f(a, b) = \langle f(t), \psi_{a,b}(t) \rangle = \frac{1}{\sqrt{a}} \int f(t)\,\overline{\psi}\!\left(\frac{t-b}{a}\right) dt,$$

in which $(a, b) \in \mathbb{R}_{>0} \times \mathbb{R}$ since $a$ and $b$ correspond to the scale factor and time shift, respectively. It is clear that this integral transform represents a $C^\infty$-function on the half-plane $\mathbb{R}_{>0} \times \mathbb{R}$.
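
The role of the factor $a^{-1/2}$ in the daughter wavelet is to keep the $L^2$ norm independent of scale and shift. The following Python sketch illustrates this numerically. The Mexican-hat function used as mother wavelet is only an illustrative assumption (the chapter does not fix $\psi$ at this point), and the helper names and the integration window are hypothetical choices.

```python
import math


def mexican_hat(t):
    """Illustrative mother wavelet (an assumption, not taken from the chapter)."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)


def daughter(psi, a, b):
    """Return psi_{a,b}(t) = a**(-1/2) * psi((t - b) / a) for scale a > 0 and shift b."""
    if a <= 0:
        raise ValueError("scale a must be positive")
    return lambda t: psi((t - b) / a) / math.sqrt(a)


def l2_norm_sq(g, lo=-60.0, hi=60.0, steps=120_000):
    """Midpoint-rule approximation of the squared L2 norm of a real function g on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (k + 0.5) * h) ** 2 for k in range(steps))


if __name__ == "__main__":
    print(l2_norm_sq(mexican_hat))
    print(l2_norm_sq(daughter(mexican_hat, 3.0, 5.0)))
```

Both printed values should agree up to quadrature error, reflecting the substitution $t \mapsto (t-b)/a$ in the norm integral.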

In [11] it is shown that Wy : 7 (R) + -/(R) is a continuous linear map. 
Naturally, since w € .“(R) the wavelet transform above can be also defined for 
fe AR) (1. 

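Some concepts about the theory of orthonormal wavelet bases of $L^2(\mathbb{R})$ are briefly recalled below [5]. An orthonormal wavelet on $\mathbb{R}$ is a function $\psi \in L^2(\mathbb{R})$ such that the family $\{\psi_{m,n}\}_{m,n \in \mathbb{Z}}$ represents an orthonormal basis of $L^2(\mathbb{R})$, where $\psi_{m,n}(x) = 2^{n/2}\,\psi(2^n x - m)$ with $m, n \in \mathbb{Z}$.

The reconstruction formula for orthogonal wavelets states that every $f \in L^2(\mathbb{R})$ can be written as

$$f = \sum_{m \in \mathbb{Z}} \sum_{n \in \mathbb{Z}} \langle f, \psi_{m,n} \rangle\,\psi_{m,n} \quad \text{in } \|\cdot\|_2. \qquad (6)$$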


The series in (6) is often called the wavelet series of $f$. In the literature, the wavelet coefficients of $f$ with respect to $\psi$ are denoted by $c_{m,n}(f)$, i.e.

$$c_{m,n}(f) = \langle f, \psi_{m,n} \rangle_{L^2(\mathbb{R})} = \int f(x)\,\overline{\psi_{m,n}(x)}\,dx. \qquad (7)$$

The notation $c_{m,n}(f)$ does not provide information about the particular family of wavelets chosen. In order to make explicit that the coefficients refer to the wavelet $\psi$, in this chapter the symbol $c^\psi_{m,n}(f)$ will be used. The wavelet coefficients $c^\psi_{m,n}(f)$ and the wavelet transform of $f$ are linked [16] by
Cnan(f) = 2 Wy f (12,2). (8) 


In this chapter we are interested in wavelet expansions of positive definite distributions,
i.e. tempered distributions (see Schwartz theorem in the next subsection), hence we
need the orthonormal wavelets to belong to $\mathcal{S}(\mathbb{R})$. In [10] it is shown that an
orthonormal wavelet from $\mathcal{S}(\mathbb{R})$ is an element of $\mathcal{S}_0(\mathbb{R})$. The existence of these
wavelets, the construction of orthonormal wavelets w € (IR) such that v € J(R) 
and the corresponding multidimensional wavelets can be found in [12]. 


2.2 Tempered and Positive Definite Distributions 


In the distribution theory it is convenient to have the regularizing functions $\phi$ as positive definite test functions (rarely functions of positive type). In brief, this is realized by passing from the test function $\phi$ to $\phi * \tilde\phi$, where $\tilde\phi(x) = \overline{\phi(-x)}$. It is possible to assume that the regularizing function $\phi$ is a test function which is even, positive and positive definite, with a never vanishing Fourier transform having the same properties [6]. The classical Bochner's theorem is an important result since it links this family of functions with the Radon measure, thus providing a full characterization of positive definite functions.

A generalization of this mathematical concept can be provided via distribution 
theory. 


Definition 1 (Positive definite distribution) A distribution $T$ on $\mathbb{R}$ such that $T(\phi * \tilde\phi) \ge 0$ for every test function $\phi$ is called a positive definite distribution (rarely, of positive type).

The definition above clearly generalizes the concept of positive definite function, i.e. every positive definite function is also a positive definite distribution. The given definition is not constructive and does not provide much information about the position of this family of distributions in the distribution theory. A generalization of Bochner's theorem, as derived by Schwartz, goes in this direction.


Theorem 1 (Schwartz) A distribution $T$ on $\mathbb{R}$ is positive definite if and only if $T \in \mathcal{S}'(\mathbb{R})$ and its Fourier transform is a positive Radon measure.


Proof It follows from Bochner's theorem (see [6]). □


Three remarkable examples of positive definite distributions are the Dirac impulse 6, 
the Cantor measure on the Cantor set and the distribution associated to the Poisson 
summation formula (see [6]). 


3 Wavelet Expansions of Positive Definite Distributions 


In this section, a theorem concerning the wavelet decomposition in the function space $\mathcal{S}_0(\mathbb{R})$ is presented in order to obtain a suitable generalization for a tempered distribution. In the last subsection an interesting example is presented and discussed.


3.1 Main Results About the Wavelet Decomposition in $\mathcal{S}_0(\mathbb{R})$


A brief summary of the wavelet expansion theory for the space $\mathcal{S}_0(\mathbb{R})$ is presented. The following statement collects the relevant results.


Theorem 2 (Wavelet expansion on $\mathcal{S}_0(\mathbb{R})$) Let $\phi \in \mathcal{S}_0(\mathbb{R})$, $f \in \mathcal{S}'_0(\mathbb{R})$ and let $\psi \in \mathcal{S}_0(\mathbb{R})$ be an orthonormal wavelet. Then

1. $\phi$ can be expressed as

$$\phi = \sum_{m \in \mathbb{Z}} \sum_{n \in \mathbb{Z}} c_{m,n}(\phi)\,\psi_{m,n}, \qquad (9)$$

with convergence in $\mathcal{S}_0(\mathbb{R})$;

2. the wavelet expansion series of $f$, namely

$$f = \sum_{m \in \mathbb{Z}} \sum_{n \in \mathbb{Z}} c_{m,n}(f)\,\psi_{m,n}, \qquad (10)$$

converges in $\mathcal{S}'_0(\mathbb{R})$;

3. $$\langle f, \phi \rangle = \sum_{m \in \mathbb{Z}} \sum_{n \in \mathbb{Z}} c_{m,n}(f)\,c_{m,n}(\phi).$$


Proof The first property follows from the definition of wavelet coefficient provided in (8). The second and third properties are direct consequences of (9) (for more details see [16]). □


The previous theorem shows the convergence of the wavelet series on the function spaces $\mathcal{S}_0(\mathbb{R})$ and $\mathcal{S}'_0(\mathbb{R})$. In the next subsection we extend these results to the case of a tempered distribution $f$.


3.2 Wavelet Decomposition for Positive Definite Distributions 


In Sect. 2.1 the function space $\mathcal{S}_0(\mathbb{R})$ has been defined as the space of the functions $\phi \in \mathcal{S}(\mathbb{R})$ such that all the moments vanish, namely (5) holds. Therefore $\mathcal{S}_0(\mathbb{R})$ is a closed subspace of $\mathcal{S}(\mathbb{R})$. From Theorem 2 the following convergence result for the wavelet expansions of positive definite distributions follows.


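Theorem 3 Let $\phi \in \mathcal{S}_0(\mathbb{R})$, $f \in \mathcal{S}'(\mathbb{R})$ and let $\psi \in \mathcal{S}_0(\mathbb{R})$ be an orthonormal wavelet. Under these hypotheses it is

$$\langle f, \phi \rangle = \sum_{m \in \mathbb{Z}} \sum_{n \in \mathbb{Z}} c^\psi_{m,n}(f)\,c^\psi_{m,n}(\phi). \qquad (11)$$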


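Proof Since $\phi \in \mathcal{S}_0(\mathbb{R})$, it follows that the product $\langle f, \phi \rangle$ is an element of $\mathcal{S}_0(\mathbb{R})$. This means that even if property 2 of Theorem 2 does not hold, the left-hand side of (11) belonging to $\mathcal{S}_0(\mathbb{R})$ exists in the sense of Theorem 2. By using (9), and taking into account that $\phi \in \mathcal{S}_0(\mathbb{R})$, we get

$$\phi = \sum_{m \in \mathbb{Z}} \sum_{n \in \mathbb{Z}} c_{m,n}(\phi)\,\psi_{m,n},$$

therefore

$$\langle f, \phi \rangle = \sum_{m \in \mathbb{Z}} \sum_{n \in \mathbb{Z}} c_{m,n}(\phi)\,\langle f, \psi_{m,n} \rangle = \sum_{m \in \mathbb{Z}} \sum_{n \in \mathbb{Z}} c_{m,n}(f)\,\langle \psi_{m,n}, \phi \rangle.$$

The proof follows directly by recalling the meaning of $c^\psi_{m,n}$ (see Sect. 2.1). □
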
If all the hypotheses of the previous theorem are satisfied and the Fourier transform of $f$ is a positive Radon measure, then according to the Schwartz theorem (see Sect. 2.2) $f$ is a positive definite distribution. Therefore the conclusion of Theorem 3 holds for every positive definite distribution.


3.3 Example and Open Problem 


Let us consider the distribution associated to the Poisson summation formula [6], 
namely the distribution T given by 


T=f(@«)= S Ck 


k=—0o 


(12) 


where f € L}.(R) is a 27-periodic function, c, are its Fourier coefficients and the 
Fourier transform of f is defined by 


f#@)= = ii Fe ax, (13) 


In [6] it is shown that $T$ is a positive definite distribution. In order to apply Theorem 3 to this distribution, a wavelet family belonging to the function space $\mathcal{S}_0(\mathbb{R})$ has to be chosen. The Shannon wavelet is defined [2] by


Vous (X) = sine(x — 1/2) — 2 sinc(2x — 1), (14) 
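
As a small illustration of formula (14), the following Python sketch evaluates the Shannon wavelet at a few points. It assumes the normalized convention $\operatorname{sinc}(x) = \sin(\pi x)/(\pi x)$ with $\operatorname{sinc}(0) = 1$; the function names are hypothetical.

```python
import math


def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with the removable singularity filled: sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def shannon_wavelet(x):
    """psi(x) = sinc(x - 1/2) - 2 sinc(2x - 1), following the form of (14)."""
    return sinc(x - 0.5) - 2.0 * sinc(2.0 * x - 1.0)


if __name__ == "__main__":
    for x in (0.5, 1.0, 1.5, 2.0):
        print(x, shannon_wavelet(x))
```

Under this convention the wavelet takes the value $-1$ at $x = 1/2$, where both sinc terms are evaluated at zero, and it decays only like $1/x$, so any numerical work with it needs a wide window.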


Since its scaling function is simply given by $\phi(x) = \operatorname{sinc}(x)$ [2], this wavelet family satisfies the hypotheses of Theorem 3. Hence (11) holds, namely

$$\langle T, \phi \rangle = \sum_{m \in \mathbb{Z}} \sum_{n \in \mathbb{Z}} c^\psi_{m,n}(T)\,c^\psi_{m,n}(\phi), \qquad (15)$$

where the coefficient $c^\psi_{m,n}(\phi)$ can be easily computed since $\phi(x) = \operatorname{sinc}(x)$. A remarkable result [2] is that the reconstruction formula (6) for Shannon wavelets makes it possible to compute the derivatives of $f$ in terms of the wavelet decomposition, i.e.


—f(x) = s ony, “90)+> s Bi 5 “vi (x), (16) 
h=—00 n=0 k=—00 


if $f \in L^2(\mathbb{R})$ and $f \in C^q$ with $q$ sufficiently high. In order to generalize the previous equation, given that in our case $f$ is a tempered distribution, it should be written in the distribution sense. To the best of our knowledge, this is an open problem. Since (16) enables us to approximate a function and its derivatives, the proposed problem might provide interesting results.


4 Fractional Calculus of Complex Functions
in the Distribution Sense


In the first part of this section, the generalization of the distribution theory to any 
complex-valued function of a complex variable is presented in order to lay the foun- 
dations for a fractional-wavelet analysis of complex functions in the distribution 
sense. In the second part, a definition of fractional derivative in the complex plane, 
which can be reinterpreted in terms of the distribution theory, is presented. 


4.1 A Complex-Variable Distribution Theory 


A generalization of the distribution theory to $\mathbb{C}$ now becomes necessary. This topic is widely discussed in [4], and the basic properties of the complex generalized functions are summarized in the following statement.


Theorem 4 Let f, g,¢, W be complex-valued functions of a complex variable s, 
a,b €C, y € Ryo, and suppose that the inner products of (1)-(5) are defined and 
convergent over (0, 00). Ifw € D(C), d € D(C) and sy) € C : (S09) = 00, the fol- 
lowing properties hold true. 


1, 

(f(s), ab(s) + bw(s)) = a(f(s), b(s)) + D(s), W(s)), ee te 

| (F(5) + 815), (8) = FG), #69) + (86), 66), ene) 

2. 

(f(s 2 So), $(S)) 9(s)=0 = (0), oly + Sa) nopaoay* (shift in C) 
3. 

1 
F(V5), O(S)) (s)=0 = TF (ro. g ()} . (scaling) 
ly| HXO)=yo 
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4, 
ao+ico 
(f(s) * (5), B(S) yao -(ro>. | Bs — y) G(s) a ; 
ate RO)=E 
with y= © +iQ. (convolution) 
5: 


(derivation) 


(f(s), O(S)) = (-I)"F(5), 9 (5), 
(F'(s)a(s), P(S)) = —(F (9), 8'(s)b(9)). 
Proof These properties immediately follow from the definition of distribution (for 
further details, see [4]). oO 


The Dirac impulse $\delta$ can be extended to the complex plane without difficulty by introducing the generalized Dirac impulse $\xi$, through a definition based on its integral about the origin [3]. The $\xi$ impulse (or generalized impulse) is a complex-valued function of the complex variable $s = \sigma + i\omega$ defined by


ib (0), = 0, 
(E(5), O(5)) ao = k is 40, (17) 
Le. 
E(s) = 0, s #0, 
[ scone =1, otherwise, 
hence 
$$\xi(i\omega) = \delta(\omega). \qquad (18)$$

Its main properties (linearity, scaling, convolution, etc.) immediately follow from Theorem 4. Another important result is that the $\xi$ impulse can be viewed as the limit of a progressively narrowing sequence of functions of increasing height as $\varepsilon \to 0$, namely the same property which holds for $\delta$ [3, 4]. In the classical theory, this sequence is represented by a family of rectangular or Gaussian windows, while it is provided by a cylindrical sequence in the complex domain (see Fig. 1). The Gaussian sequence used to approximate the Dirac impulse is


—x?/e 


1 
x) = ——e ; 
= 
while its complex generalization is simply given by 


est ye, (19) 


We(S) = re 


Fig. 1 3D plots of thew. (s=x+ iy)| for ¢ = 1 (a) and ¢ = 10~? (b). They provide a geometrical 
interpretation of (20) 


It is not hard to show [4] that

$$\xi(s) = \lim_{\varepsilon \to 0} w_\varepsilon(s). \qquad (20)$$

Figure 1 illustrates how $w_\varepsilon$ approximates $\xi$ for $\varepsilon$ close to 0 according to the previous equation.
Let

$$I = \int_l \xi(s)\,\phi(s)\,ds \qquad (21)$$

be the integral along a straight line $l$ in the complex plane. If the origin does not belong to the line $l$, it can be described by $s = s_0 + re^{i\theta}$, where $s_0 \ne 0$ and $s \ne 0$ for every $s_0$, $r$ and $\theta$. Both in this case and if the impulse is a Gaussian sequence, $I = 0$ [4]. Hence, we have to consider only the case when the origin belongs to $l$. If $w_\varepsilon$ is a Gaussian or cylinder sequence, since $s = re^{i\theta}$, it is [4]

$$I = e^{i\theta} \int \delta(r)\,\phi(re^{i\theta})\,dr,$$

therefore

$$\xi(re^{i\theta}) = \delta(r) \implies \xi(i\omega) = \delta(\omega). \qquad (22)$$


The previous equation justifies the definition (18). This generalization of the 
distribution theory to C is suitable to several applications in Laplace, z transforms, 
as well as in differential and difference equations [3, 4]. 


4.2 The Complex Fractional Derivative Operator 
in the Distribution Sense 


The well-known convolution method, due to Schwartz [17], is based on the possibility
to write the Riemann–Liouville fractional integral $I^\alpha$ [15] with lower boundary $a = 0$
as a convolution, i.e.

$$I^\alpha f(x) = \frac{1}{\Gamma(\alpha)} \int_0^x f(t)\,(x-t)^{\alpha-1}\,dt = f(x) * \frac{x_+^{\alpha-1}}{\Gamma(\alpha)}, \qquad (23)$$

where the function

$$x_+^{\alpha} = x^{\alpha}\,u(x), \qquad (24)$$

defines the regular distribution

$$(x_+^{\alpha}, \varphi) = \int_0^{\infty} x^{\alpha}\,\varphi(x)\,dx, \qquad \forall \varphi \in \mathcal{D}(\mathbb{R}), \qquad (25)$$

for $\alpha > -1$ (indeed, the integral above makes sense if $\Re(\alpha) > -1$) [8]. Formula (23)
can be written as a convolution without the restriction $a = 0$ by defining $x_+^{\alpha} = x^{\alpha}\,u(x - a)$
instead of (24). Clearly (23) makes sense if the associated convolution is valid.

Following the same approach provided for the real fractional calculus (distribution
sense) [8], we can denote

$$\Phi_\alpha(x) = \frac{x_+^{\alpha-1}}{\Gamma(\alpha)}, \qquad (26)$$

and hence (23) becomes

$$I^\alpha f(x) = f(x) * \Phi_\alpha(x). \qquad (27)$$

Formula (27) shows that $I^\alpha$ is suitable to define real fractional derivatives in
the distribution sense. Moreover, $\Phi_\alpha(x)$ satisfies the semigroup property [8], i.e.
$\Phi_\alpha(x) * \Phi_\eta(x) = \Phi_{\alpha+\eta}(x)$ if $\alpha$ and $\eta$ have real parts greater than zero.

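Indeed, under these hypotheses it is

$$\Gamma(\alpha)\,\Gamma(\eta)\,\Phi_\alpha(x) * \Phi_\eta(x) = x_+^{\alpha-1} * x_+^{\eta-1} = \int_0^x y^{\alpha-1}\,(x-y)^{\eta-1}\,dy.$$

By a change of variables $y = xt$, the right-hand side (RHS) becomes

$$\int_0^1 (xt)^{\alpha-1}\,(x - xt)^{\eta-1}\,x\,dt = x_+^{\alpha+\eta-1} \int_0^1 t^{\alpha-1}\,(1-t)^{\eta-1}\,dt = x_+^{\alpha+\eta-1}\,B(\alpha, \eta)$$

$$= x_+^{\alpha+\eta-1}\,\frac{\Gamma(\alpha)\,\Gamma(\eta)}{\Gamma(\alpha+\eta)} = \Gamma(\alpha)\,\Gamma(\eta)\,\Phi_{\alpha+\eta}(x),$$

where $B$ is the Euler beta function [1].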
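The semigroup identity can be spot-checked numerically for real orders. The following sketch takes the kernel to be $x_+^{\alpha-1}/\Gamma(\alpha)$ as in (26); the names `phi` and `convolve_at`, the midpoint rule, and the sample orders are illustrative assumptions. It compares the convolution of two kernels at a point with the kernel of the summed order.

```python
import math

def phi(alpha: float, x: float) -> float:
    """Kernel x_+^(alpha-1) / Gamma(alpha); zero for x <= 0."""
    return x ** (alpha - 1.0) / math.gamma(alpha) if x > 0 else 0.0

def convolve_at(alpha: float, eta: float, x: float, n: int = 20_000) -> float:
    """Midpoint rule for the convolution int_0^x phi(alpha, y) * phi(eta, x - y) dy."""
    h = x / n
    return h * sum(
        phi(alpha, (k + 0.5) * h) * phi(eta, x - (k + 0.5) * h) for k in range(n)
    )

alpha, eta, x = 1.5, 2.5, 2.0
lhs = convolve_at(alpha, eta, x)   # numerical convolution of the two kernels
rhs = phi(alpha + eta, x)          # kernel of order alpha + eta, here 2^3 / Gamma(4)
print(lhs, rhs)
```

Orders of at least 1 are used so that the integrand has no endpoint singularity and a plain midpoint rule suffices; smaller positive orders would need a quadrature adapted to the singular endpoints.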

The close link between the Riemann–Liouville fractional integral and the Caputo
fractional derivative [15] explains why distribution theory can be
applied to the latter.

Ortigueira's generalization of the $\alpha$-order Caputo fractional derivative ${}_{c}D^{\alpha}$ to the
complex plane is defined [13] by

$${}_{c}D^{\alpha}_{\theta} f(s) \overset{\text{def}}{=} \frac{e^{i(\pi-\theta)(\alpha-m)}}{\Gamma(m-\alpha)} \int_0^{\infty} \frac{f^{(m)}(xe^{i\theta} + s)}{x^{\alpha-m+1}}\,dx, \qquad (28)$$

where $f$ is a complex-valued function of the complex variable $s$, $m - 1 < \alpha < m \in \mathbb{Z}^{+}$
and $\theta \in [0, 2\pi)$. This fractional operator is defined such that the integral and
derivative signs can be exchanged [13], hence


ioe) 
el(t—8)(a—m) dq” f (xe? +5) 


I'(m— a) ds” xem 
0 


FO) = 


el(a— Ya m) ce 


=— m—a—1 0 DK b 


iz i 


[0 Z)( cy (—1) dz’ - fe, aay 
0 


eit(a—m) di 


= I'(m — a) ds” 


I'(m—@) 


m—a—1 
m—a—1 Sy 
a) «st <= ia (17s): 


where the function $s_+^{m-\alpha-1} = s^{m-\alpha-1}\,u(\Re(s))\,u(\Im(s))$ is the complex counterpart of
(24). Indeed, the function $s_+^{m-\alpha-1}$ defines the regular distribution

$$(s_+^{m-\alpha-1}, \varphi(s)) = \int_0^{\infty} s^{m-\alpha-1}\,\varphi(s)\,ds, \qquad \forall \varphi \in \mathcal{D}(\mathbb{C}), \qquad (29)$$

in the sense of distribution theory for complex functions (see Theorem 4 in Sect. 4.1).
This definition holds because $m - \alpha > 0$ and makes sense if the associated convolution
is valid.
As in the real case, we can introduce the function

$$\Phi_{m-\alpha}(s) = \frac{s_+^{m-\alpha-1}}{\Gamma(m-\alpha)}, \qquad (30)$$
hence 
Oa (f(s) * Pm—a(S)). (31) 


Therefore (31) is the complex counterpart of (27), namely it provides the fractional
derivative on the function space $\mathcal{D}'(\mathbb{C})$.

In this case the semigroup property is not satisfied, i.e. $\Phi_{m-\alpha}(s) * \Phi_{n-\beta}(s) \neq
\Phi_{m-\alpha+n-\beta}(s)$ if $m - 1 < \alpha < m \in \mathbb{Z}^{+}$ and $n - 1 < \beta < n \in \mathbb{Z}^{+}$.

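Indeed,

$$\Gamma(m-\alpha)\,\Gamma(n-\beta)\,\Phi_{m-\alpha}(s) * \Phi_{n-\beta}(s) = s_+^{m-\alpha-1} * s_+^{n-\beta-1} = \int_0^{\infty} z^{m-\alpha-1}\,(s - z)^{n-\beta-1}\,dz. \qquad (32)$$

By a change of variables $z = sw$, the RHS becomes

$$\int_0^{\infty} (sw)^{m-\alpha-1}\,(s - sw)^{n-\beta-1}\,s\,dw = s^{m-\alpha+n-\beta-1} \int_0^{\infty} w^{m-\alpha-1}\,(1 - w)^{n-\beta-1}\,dw$$

$$\neq s^{m-\alpha+n-\beta-1} \int_0^{1} w^{m-\alpha-1}\,(1 - w)^{n-\beta-1}\,dw$$

$$= s^{m-\alpha+n-\beta-1}\,B(m-\alpha, n-\beta) = \Gamma(m-\alpha)\,\Gamma(n-\beta)\,\Phi_{m-\alpha+n-\beta}.$$

The semigroup property does not hold since in (28) the upper limit of integration
is infinity, hence the last computation does not provide the Euler beta function.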


5 Fractional Derivative of Complex Wavelets 


The aim of this section is to show the power of the Ortigueira-Caputo fractional 
derivative by computing the fractional derivative of the Gabor—Morlet wavelet. 


5.1 Gabor—Morlet Wavelet 


The Gabor–Morlet wavelet (sometimes Gabor wavelet or complex Morlet wavelet)
is a complex wavelet widely used in geophysical applications. It is given [18] by

$$\psi_{GM}(x) = \frac{1}{\sqrt{\pi f_b}}\, e^{-x^2/f_b}\, e^{i 2\pi f_c x}. \qquad (33)$$

In the current literature, a common choice is taking $f_b = 2$ and $f_c = \frac{5}{2\pi}$, since the
value $\omega_c = 2\pi f_c = 5$ is often used in the applications. Hence, we get

$$\psi_{GM}(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\, e^{i x \omega_c}. \qquad (34)$$
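A small numerical sketch can confirm that the parameter choice above reduces the general form to the simplified one. It assumes the usual complex Morlet expression with bandwidth $f_b$ and centre frequency $f_c$; the function names `gabor_morlet` and `gabor_morlet_standard` are hypothetical and introduced only for this illustration.

```python
import cmath
import math

def gabor_morlet(x: float, f_b: float = 2.0, f_c: float = 5.0 / (2.0 * math.pi)) -> complex:
    """General form: exp(-x^2/f_b) * exp(i*2*pi*f_c*x) / sqrt(pi*f_b) (assumed expression)."""
    return cmath.exp(2j * math.pi * f_c * x) * math.exp(-x * x / f_b) / math.sqrt(math.pi * f_b)

def gabor_morlet_standard(x: float, omega_c: float = 5.0) -> complex:
    """Simplified form for f_b = 2: exp(-x^2/2) * exp(i*omega_c*x) / sqrt(2*pi)."""
    return cmath.exp(1j * omega_c * x) * math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

for x in (-1.0, 0.0, 0.3, 2.0):
    a, b = gabor_morlet(x), gabor_morlet_standard(x)
    print(x, a, abs(a - b))
```

The real and imaginary parts returned here are the Gaussian-modulated cosine and sine that make up the wavelet.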


Its fractional derivative is provided by the following statement. 

Fig. 2 Real part (solid curve) and imaginary part (dashed curve) of the $\alpha$-order fractional derivative
of the Gabor–Morlet wavelet with $\alpha = 0.4$ (a) and $\alpha = 1.4$ (b)
Theorem 5 Let $s \in \mathbb{C}$ such that $\Re(s) = 0$ and let $m - 1 < \alpha < m \in \mathbb{Z}^{+}$. The $\alpha$-order
fractional derivative of the Gabor–Morlet wavelet is given by


Ge (sy = eT ea? ar 7(8). (35) 


Proof It is easy to show this property with a direct computation following the same
approach given in [9] for the Riemann $\zeta$ function. Indeed, since


fore) 
el 4; 1 Bie ok = 
fe +ix)o-, a dy =e i6(m—a) @lX@c (—a,)* m r(m = a), 
ye 

0 


we get 


ml 


° 1 i2na ,—x* a-—m SO, i2m{a a 
Yo ° ~ oo ~ oe a ° MS ds” (e ‘) =< . of We (s). (36) 
IU 


oO 


This theorem shows that the $\alpha$-order fractional derivative of the Gabor–Morlet
wavelet is nothing more than the product of the same wavelet y,,,(s = ix) and the 
complex factor e77{%) @*, 

Therefore the real and imaginary parts of this fractional derivative still belong to 
the family of Gabor—Morlet wavelets, as shown in Fig. 2. 


6 Conclusion 


In this chapter, a wavelet expansion of positive definite distributions is provided. 
The Ortigueira-Caputo fractional operator, which provides the fractional derivative 
of a complex function, is rewritten in the distribution sense. The fractional derivative 
of the Gabor—Morlet wavelet is computed. An open problem, concerning the pos- 
sibility to generalize the reconstruction formula for Shannon wavelets through the 
distribution theory, is proposed. This fractional derivative operator has already given 
some results in analytic number theory (see for instance [9]) and could be able to 
describe different physical phenomena, while the proposed wavelet expansion could 
be applied to other distribution families in order to obtain another generalization of 
the Theorem 2. 
Linear Classification of Data with Support
Vector Machines and Generalized Support 
Vector Machines 


Talat Nazir, Xiaomin Qi and Sergei Silvestrov 


Abstract In this paper, we study the support vector machine and introduce the
notion of generalized support vector machine for classification of data. We show that
the problem of generalized support vector machine is equivalent to the problem of
generalized variational inequality and establish various results for the existence of
solutions. Moreover, we provide various examples to support our results.


Keywords Linear classification - Support vector machine - Generalized support 
vector machine - Kernel function 


1 Support Vector Machine 


Over the last decade, support vector machines (SVMs) [2, 3, 13, 14, 18] have
emerged as a powerful and important tool for pattern classification and regression. They
have been used in various applications such as text classification [5], facial expression
recognition [9], gene analysis [4] and many others [1, 6-8, 10-12, 15, 19-22].
Recently, Wang et al. [16] presented an SVM based fault classifier design for a water
level control system. They also studied SVM classifier based fault diagnosis for
a water level process [17].


T. Nazir (✉) · X. Qi · S. Silvestrov
Division of Applied Mathematics, School of Education,
Culture and Communication, Mälardalen University, Box 883, 721 23 Västerås, Sweden
e-mail: talat@ciit.net.pk

X. Qi
e-mail: xiaomin.qi@mdh.se

S. Silvestrov
e-mail: sergei.silvestrov@mdh.se

T. Nazir
Department of Mathematics, COMSATS Institute of Information Technology,
Abbottabad 22060, Pakistan

© Springer International Publishing Switzerland 2016
S. Silvestrov and M. Rančić (eds.), Engineering Mathematics II,
Springer Proceedings in Mathematics & Statistics 179,
DOI 10.1007/978-3-319-42105-6_17



For the standard support vector classification (SVC), the basic idea is to find 
the optimal separating hyperplane between the positive and negative examples. The 
optimal hyperplane may be obtained by maximizing the margin between two parallel 
hyperplanes, which involves the minimization of a quadratic programming problem. 

Support Vector Machines are based on the concept of decision planes that define 
decision boundaries. A decision plane is one that separates between a set of objects 
having different class memberships. 

Support Vector Machines can be thought of as a method for constructing a special 
kind of rule, called a linear classifier, in a way that produces classifiers with theoretical 
guarantees of good predictive performance (the quality of classification on unseen 
data). 

In this paper, we study the problems of support vector machine and define general- 
ized support vector machine. We also show the sufficient conditions for the existence 
of solutions for problems of generalized support vector machine. We also support 
our results with various examples. 

Throughout this paper, by $\mathbb{N}$, $\mathbb{R}$, $\mathbb{R}^n$ and $\mathbb{R}^n_+$ we denote the set of all natural numbers,
the set of all real numbers, the set of all n-tuples of real numbers, and the set of all n-tuples
of nonnegative real numbers, respectively.

Also, we consider $\|\cdot\|$ and $\langle \cdot, \cdot \rangle$ as the Euclidean norm and the usual inner product on
$\mathbb{R}^n$, respectively.

Furthermore, for two vectors $x, y \in \mathbb{R}^n$, we say that $x \le y$ if and only if $x_i \le y_i$
for all $i \in \{1, 2, \ldots, n\}$, where $x_i$ and $y_i$ are the components of $x$ and $y$, respectively.


2 Linear Classifiers 


Binary classification is frequently performed by using a function $f : \mathbb{R}^n \to \mathbb{R}$ in
the following way: the input $x = (x_1, \ldots, x_n)$ is assigned to the positive class if
$f(x) \ge 0$, and otherwise to the negative class. We consider the case where $f(x)$ is
a linear function of $x$, so that it can be written as

$$f(x) = \langle w, x \rangle + b = \sum_{i=1}^{n} w_i x_i + b,$$

where $w \in \mathbb{R}^n$, $b \in \mathbb{R}$ are the parameters that control the function and the decision
rule is given by $\operatorname{sgn}(f(x))$. The learning methodology implies that these parameters
must be learned from the data.


Definition 1 We define the functional margin of an example $(x_k, y_k)$ with respect
to a hyperplane $(w, b)$ to be the quantity

$$\gamma_k = y_k\,(\langle w, x_k \rangle + b),$$

where $y_k \in \{-1, 1\}$. Note that $\gamma_k > 0$ implies correct classification of $(x_k, y_k)$. If we
replace functional margin by geometric margin we obtain the equivalent quantity
for the normalized linear function $\left(\frac{w}{\|w\|}, \frac{b}{\|w\|}\right)$, which therefore measures the
Euclidean distances of the points from the decision boundary in the input space.

Actually geometric margin can be written as

$$\tilde{\gamma} = \frac{1}{\|w\|}\,\gamma.$$

To find the hyperplane which has maximal geometric margin for a training set $S$
means to find maximal $\tilde{\gamma}$. For convenience, we let $\gamma = 1$, so that the objective function can
be written as

$$\max \frac{1}{\|w\|}.$$

Of course, there are some constraints for the optimization problem. According to
the definition of margin, we have $y_k\,(\langle w, x_k \rangle + b) \ge 1$, $k = 1, \ldots, l$. We rewrite the
equivalent formulation of the objective function with the constraints as

$$\min \frac{1}{2}\,\|w\|^2 \quad \text{s.t.} \quad y_k\,(\langle w, x_k \rangle + b) \ge 1, \quad k = 1, \ldots, l.$$

We denote this problem by SVM.
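The quantities in this section are straightforward to compute. The sketch below is only an illustration with hypothetical helper names (`functional_margin`, `geometric_margin`, `is_feasible`) and made-up sample values; it evaluates the functional and geometric margins of a labelled point and checks the SVM constraints for a candidate hyperplane.

```python
import math

def functional_margin(w, b, x, y):
    """y * (<w, x> + b) for a labelled example (x, y) with y in {-1, 1}."""
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b)

def geometric_margin(w, b, x, y):
    """Functional margin divided by the Euclidean norm of w."""
    return functional_margin(w, b, x, y) / math.sqrt(sum(wi * wi for wi in w))

def is_feasible(w, b, data, tol=1e-12):
    """True if every example satisfies y * (<w, x> + b) >= 1 (up to a small tolerance)."""
    return all(functional_margin(w, b, x, y) >= 1.0 - tol for x, y in data)

# Hypothetical hyperplane and point, chosen only to make the arithmetic easy to follow.
w, b = (3.0, 4.0), -1.0
print(functional_margin(w, b, (1.0, 1.0), 1))   # 3 + 4 - 1 = 6
print(geometric_margin(w, b, (1.0, 1.0), 1))    # 6 / 5 = 1.2
```

Rescaling $(w, b)$ by a positive constant changes the functional margin but not the geometric one, which is why the functional margin can be fixed to 1 in the optimization problem.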


3 Generalized Support Vector Machines 


We replace $w$, $b$ by $W$, $B$ respectively. The control function $F : \mathbb{R}^n \to \mathbb{R}^n$ is defined
as

$$F(x) = Wx + B, \qquad (1)$$

where $W \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^n$ are the parameters of the control function.
Define

$$\tilde{\gamma}_k = y_k\,(W x_k + B) \ge 1 \quad \text{for } k = 1, 2, \ldots, l, \qquad (2)$$

where $y_k \in \{(-1, -1, \ldots, -1), (1, 1, \ldots, 1)\}$ is an n-dimensional vector.

Definition 2 We define a map $G : \mathbb{R}^n \to \mathbb{R}^n_+$ by

$$G(w_i) = (\|w_i\|, \|w_i\|, \ldots, \|w_i\|) \quad \text{for } i = 1, 2, \ldots, n, \qquad (3)$$

where $w_i$ are the rows of $W_{n \times n}$, for $i = 1, 2, \ldots, n$.

The problem is to find $w_i \in \mathbb{R}^n$ that satisfy

$$\min_{w_i \in W} G(w_i) \quad \text{s.t.} \quad \eta \ge 0, \qquad (4)$$

where $\eta = y_k\,(W x_k + B) - 1$.

We call this problem a Generalized Support Vector Machine (GSVM).
The GSVM is equivalent to

find $w_i \in W$: $\langle G'(w_i), v - w_i \rangle \ge 0$ for all $v \in \mathbb{R}^n$ with $\eta \ge 0$,

or more specifically

find $w_i \in W$: $\langle \eta\,G'(w_i), v - w_i \rangle \ge 0$ for all $v \in \mathbb{R}^n$. $\qquad (5)$

Hence the problem of GSVM becomes the problem of generalized variational
inequality.


Example 1 Let us take the group of points of positive class (1, 0), (0, 1) and negative
class (−1, 0), (0, −1).

First we use SVM to solve this problem to find the hyperplane $\langle w, x \rangle + b = 0$
that separates these two kinds of points. Obviously, we know that the hyperplane is
$H$, which is shown in Fig. 1.

Fig. 1 Example 1

For two positive points, we have

$$(w_1, w_2)\begin{pmatrix} 1 \\ 0 \end{pmatrix} + b = 1, \qquad (w_1, w_2)\begin{pmatrix} 0 \\ 1 \end{pmatrix} + b = 1,$$

which implies

$$w_1 + b = 1, \qquad w_2 + b = 1.$$

For two negative points, we have

$$(w_1, w_2)\begin{pmatrix} -1 \\ 0 \end{pmatrix} + b = -1, \qquad (w_1, w_2)\begin{pmatrix} 0 \\ -1 \end{pmatrix} + b = -1,$$

which implies that

$$-w_1 + b = -1, \qquad -w_2 + b = -1.$$

From the equations, we get $w = (1, 1)$ and $b = 0$. The result is $\|w\| = \sqrt{2}$.
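The linear system of this example can be solved mechanically. The following sketch is an illustration only: `solve_linear` and `hyperplane_through` are hypothetical helpers, and the approach (solving the margin equations of three points exactly, then checking the fourth) mirrors the hand computation rather than a general SVM solver.

```python
from fractions import Fraction

def solve_linear(A, rhs):
    """Gauss-Jordan elimination with exact rational arithmetic (square, nonsingular system)."""
    M = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(A, rhs)]
    n = len(M)
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [row[-1] for row in M]

def hyperplane_through(active):
    """active: three pairs ((x1, x2), y) assumed to satisfy <w, x> + b = y exactly."""
    rows = [(x1, x2, 1) for (x1, x2), _ in active]
    rhs = [y for _, y in active]
    w1, w2, b = solve_linear(rows, rhs)
    return (w1, w2), b

# Two positive points and one negative point of the example.
w, b = hyperplane_through([((1, 0), 1), ((0, 1), 1), ((-1, 0), -1)])
print(w, b)                      # w = (1, 1), b = 0
print(-w[1] + b == -1)           # the remaining negative point (0, -1) also lies on its margin
print(w[0] ** 2 + w[1] ** 2)     # squared norm 2, i.e. ||w|| = sqrt(2)
```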


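Now we apply GSVM for this data. For two positive points, we have

$$\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad \begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix},$$

which gives

$$\begin{pmatrix} w_{11} \\ w_{21} \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} w_{12} \\ w_{22} \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}.$$

For two negative points, we have

$$\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix}\begin{pmatrix} -1 \\ 0 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} -1 \\ -1 \end{pmatrix}, \qquad \begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix}\begin{pmatrix} 0 \\ -1 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} -1 \\ -1 \end{pmatrix},$$

which gives

$$W = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$

Thus

$$\min G(w_i) = \min\{G(w_1), G(w_2)\} = (\sqrt{2}, \sqrt{2}).$$

Hence we get $w = (1, 1)$ that minimizes $G(w_i)$ for $i = 1, 2$.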


Remark 1 The above example shows that we get the same result with either
method, SVM or GSVM.

In the next example, we consider two distinct groups of data. We first solve each
group separately and then solve the combined case, using both methods SVM
and GSVM.

Example 2 Let us consider the three categories of data.

Situation 1 Suppose that we have data (1, 0), (0, 1) as positive class and data
(−1/2, 0), (0, −1/2) as negative class shown in Fig. 2.

Fig. 2 Example 2, situation 1

Using SVM to solve this problem, we have

$$(w_1, w_2)\begin{pmatrix} 1 \\ 0 \end{pmatrix} + b = 1,$$

and

$$(w_1, w_2)\begin{pmatrix} 0 \\ 1 \end{pmatrix} + b = 1,$$

which implies

$$w_1 + b = 1 \quad \text{and} \quad w_2 + b = 1. \qquad (8)$$

For two negative points, we have

$$(w_1, w_2)\begin{pmatrix} -1/2 \\ 0 \end{pmatrix} + b = -1,$$

and

$$(w_1, w_2)\begin{pmatrix} 0 \\ -1/2 \end{pmatrix} + b = -1,$$

which gives

$$-\frac{w_1}{2} + b = -1 \quad \text{and} \quad -\frac{w_2}{2} + b = -1. \qquad (9)$$

From (8) and (9), we get $w = \left(\frac{4}{3}, \frac{4}{3}\right)$ with $b = -\frac{1}{3}$, where $\|w\| = \frac{4\sqrt{2}}{3}$.


For situation 2, we consider the data (1/2, 0) and (0, 1/2) as positive class, data
(−2, 0) and (0, −2) as negative class shown in Fig. 3.

Fig. 3 Example 2, situation 2

Using SVM to solve this problem, we have

$$(w_1, w_2)\begin{pmatrix} 1/2 \\ 0 \end{pmatrix} + b = 1,$$

and

$$(w_1, w_2)\begin{pmatrix} 0 \\ 1/2 \end{pmatrix} + b = 1,$$

which implies

$$\frac{1}{2} w_1 + b = 1 \quad \text{and} \quad \frac{1}{2} w_2 + b = 1. \qquad (10)$$

From the negative points, we have

$$(w_1, w_2)\begin{pmatrix} -2 \\ 0 \end{pmatrix} + b = -1,$$

and

$$(w_1, w_2)\begin{pmatrix} 0 \\ -2 \end{pmatrix} + b = -1,$$

which implies that

$$-2 w_1 + b = -1 \quad \text{and} \quad -2 w_2 + b = -1. \qquad (11)$$

From (10) and (11), we get $w = \left(\frac{4}{5}, \frac{4}{5}\right)$ and $b = \frac{3}{5}$ with $\|w\| = \frac{4\sqrt{2}}{5}$.


In the next situation 3, we combine these two groups of data. Now, we have data
(1/2, 0), (0, 1/2), (1, 0), (0, 1) as positive class and (−1/2, 0), (0, −1/2), (−2, 0),
(0, −2) as negative class shown in Fig. 4.

Fig. 4 Example 2, situation 3

Using SVM to solve this problem, we have

$$(w_1, w_2)\begin{pmatrix} 1/2 \\ 0 \end{pmatrix} + b = 1,$$

and

$$(w_1, w_2)\begin{pmatrix} 0 \\ 1/2 \end{pmatrix} + b = 1,$$

which implies

$$\frac{w_1}{2} + b = 1 \quad \text{and} \quad \frac{w_2}{2} + b = 1. \qquad (12)$$

For two negative points, we have

$$(w_1, w_2)\begin{pmatrix} -1/2 \\ 0 \end{pmatrix} + b = -1,$$

and

$$(w_1, w_2)\begin{pmatrix} 0 \\ -1/2 \end{pmatrix} + b = -1,$$

which implies that

$$-\frac{w_1}{2} + b = -1 \quad \text{and} \quad -\frac{w_2}{2} + b = -1. \qquad (13)$$

From (12) and (13), we obtain $w = (2, 2)$ and $b = 0$, where $\|w\| = 2\sqrt{2}$.
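A quick check shows why the combined data set needs its own hyperplane. The sketch below is illustrative: the helper `margins` is hypothetical, and the candidate hyperplanes for situations 1 and 2 are the values obtained by solving (8)–(9) and (10)–(11). It evaluates the functional margins of all eight combined points under each of the three candidates.

```python
from fractions import Fraction as Fr

def margins(w, b, data):
    """Functional margins y * (<w, x> + b) for 2-D labelled points."""
    return [y * (w[0] * x[0] + w[1] * x[1] + b) for x, y in data]

half = Fr(1, 2)
combined = [
    ((half, 0), 1), ((0, half), 1), ((1, 0), 1), ((0, 1), 1),
    ((-half, 0), -1), ((0, -half), -1), ((-2, 0), -1), ((0, -2), -1),
]

candidates = {
    "situation 1": ((Fr(4, 3), Fr(4, 3)), Fr(-1, 3)),  # solves (8)-(9)
    "situation 2": ((Fr(4, 5), Fr(4, 5)), Fr(3, 5)),   # solves (10)-(11)
    "situation 3": ((Fr(2), Fr(2)), Fr(0)),            # solves (12)-(13)
}

smallest = {name: min(margins(w, b, combined)) for name, (w, b) in candidates.items()}
for name, value in smallest.items():
    print(name, value)
```

Only the situation 3 hyperplane keeps every margin at least 1; the situation 1 hyperplane leaves (1/2, 0) inside the margin, and the situation 2 hyperplane misclassifies (−1/2, 0).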


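Now we solve the same problem for all three situations by using GSVM.
For two positive points of situation 1, we have

$$\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad \begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix},$$

which implies

$$\begin{pmatrix} w_{11} \\ w_{21} \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} w_{12} \\ w_{22} \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \qquad (14)$$

Again, for the negative points, we have

$$\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix}\begin{pmatrix} -1/2 \\ 0 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} -1 \\ -1 \end{pmatrix}$$

and

$$\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix}\begin{pmatrix} 0 \\ -1/2 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} -1 \\ -1 \end{pmatrix},$$

which gives

$$\begin{pmatrix} -\frac{1}{2} w_{11} \\ -\frac{1}{2} w_{21} \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} -1 \\ -1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} -\frac{1}{2} w_{12} \\ -\frac{1}{2} w_{22} \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} -1 \\ -1 \end{pmatrix}. \qquad (15)$$

From (14) and (15), we get

$$W = \begin{pmatrix} \frac{4}{3} & \frac{4}{3} \\ \frac{4}{3} & \frac{4}{3} \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} -\frac{1}{3} \\ -\frac{1}{3} \end{pmatrix}.$$

Thus we get

$$\min_{i \in \{1,2\}} G(w_i) = \left(\frac{4\sqrt{2}}{3}, \frac{4\sqrt{2}}{3}\right).$$

Hence we get $w = \left(\frac{4}{3}, \frac{4}{3}\right)$ that minimizes $G(w_i)$ for $i = 1, 2$.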


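Now, for positive points of situation 2, we have

$$\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix}\begin{pmatrix} 1/2 \\ 0 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad \begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix}\begin{pmatrix} 0 \\ 1/2 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix},$$

which gives

$$\begin{pmatrix} \frac{1}{2} w_{11} \\ \frac{1}{2} w_{21} \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} \frac{1}{2} w_{12} \\ \frac{1}{2} w_{22} \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \qquad (16)$$

For two negative points for this case, we have

$$\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix}\begin{pmatrix} -2 \\ 0 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} -1 \\ -1 \end{pmatrix}$$

and

$$\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix}\begin{pmatrix} 0 \\ -2 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} -1 \\ -1 \end{pmatrix},$$

which gives

$$\begin{pmatrix} -2 w_{11} \\ -2 w_{21} \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} -1 \\ -1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} -2 w_{12} \\ -2 w_{22} \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} -1 \\ -1 \end{pmatrix}. \qquad (17)$$

Thus, we obtain that

$$W = \begin{pmatrix} \frac{4}{5} & \frac{4}{5} \\ \frac{4}{5} & \frac{4}{5} \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} \frac{3}{5} \\ \frac{3}{5} \end{pmatrix}.$$

Thus we get

$$\min_{i \in \{1,2\}} G(w_i) = \left(\frac{4\sqrt{2}}{5}, \frac{4\sqrt{2}}{5}\right).$$

Hence we get $w = \left(\frac{4}{5}, \frac{4}{5}\right)$ that minimizes $G(w_i)$ for $i = 1, 2$.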


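For the positive points of the combined data for situation 3, we have

$$\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix}\begin{pmatrix} 1/2 \\ 0 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$$

and

$$\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix}\begin{pmatrix} 0 \\ 1/2 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix},$$

which gives

$$\begin{pmatrix} \frac{1}{2} w_{11} \\ \frac{1}{2} w_{21} \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} \frac{1}{2} w_{12} \\ \frac{1}{2} w_{22} \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}.$$

For two negative points for this case, we have

$$\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix}\begin{pmatrix} -1/2 \\ 0 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} -1 \\ -1 \end{pmatrix}$$

and

$$\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix}\begin{pmatrix} 0 \\ -1/2 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} -1 \\ -1 \end{pmatrix},$$

which gives

$$\begin{pmatrix} -\frac{1}{2} w_{11} \\ -\frac{1}{2} w_{21} \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} -1 \\ -1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} -\frac{1}{2} w_{12} \\ -\frac{1}{2} w_{22} \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} -1 \\ -1 \end{pmatrix}. \qquad (18)$$

From this, we obtain that

$$W = \begin{pmatrix} 2 & 2 \\ 2 & 2 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$

Thus we get

$$\min_{i \in \{1,2\}} G(w_i) = (2\sqrt{2}, 2\sqrt{2}).$$

Hence we get $w = (2, 2)$ that minimizes $G(w_i)$ for $i = 1, 2$.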
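The GSVM computation for the combined data can be replayed in a few lines. This is a sketch under stated assumptions: `apply_control` and `G_map` are hypothetical names, `apply_control` evaluates the control function $Wx + B$, and `G_map` returns the vector whose entries all equal the norm of a row of $W$.

```python
import math

def apply_control(W, B, x):
    """Control function F(x) = W x + B for a square matrix W given as a list of rows."""
    return [sum(wij * xj for wij, xj in zip(row, x)) + bi for row, bi in zip(W, B)]

def G_map(row):
    """G(w_i) = (||w_i||, ..., ||w_i||), with as many entries as the row has components."""
    norm = math.sqrt(sum(v * v for v in row))
    return [norm] * len(row)

W = [[2.0, 2.0], [2.0, 2.0]]   # matrix obtained for situation 3
B = [0.0, 0.0]

print(apply_control(W, B, (0.5, 0.0)))    # positive support point -> [1.0, 1.0]
print(apply_control(W, B, (-0.5, 0.0)))   # negative support point -> [-1.0, -1.0]
print(G_map(W[0]))                        # both entries equal 2*sqrt(2)
```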
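Proposition 1 Let $G : \mathbb{R}^n \to \mathbb{R}^n_+$ be a differentiable operator. An element $w^* \in \mathbb{R}^n$
minimizes $G$ if and only if $G'(w^*) = 0$, that is, $w^* \in \mathbb{R}^n$ solves GSVM if and only if
$G'(w^*) = 0$.

Proof Let $G'(w^*) = 0$, then for all $v \in \mathbb{R}^n$,

$$\langle \eta\,G'(w^*), v - w^* \rangle = \langle 0, v - w^* \rangle = 0.$$

Consequently, the inequality

$$\langle \eta\,G'(w^*), v - w^* \rangle = \langle 0, v - w^* \rangle \ge 0$$

holds for all $v \in \mathbb{R}^n$. Hence $w^* \in \mathbb{R}^n$ solves the problem of GSVM.
Conversely, assume that $w^* \in \mathbb{R}^n$ satisfies

$$\langle \eta\,G'(w^*), v - w^* \rangle \ge 0 \quad \forall v \in \mathbb{R}^n.$$

Taking $v = w^* - G'(w^*)$ in the above inequality implies that

$$\langle \eta\,G'(w^*), -G'(w^*) \rangle \ge 0,$$

which further implies

$$-\eta\,\|G'(w^*)\|^2 \ge 0.$$

Since $\eta > 0$, we get $G'(w^*) = 0$. □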


Definition 3 Let $K$ be a closed and convex subset of $\mathbb{R}^n$. Then, for every point
$x \in \mathbb{R}^n$, there exists a unique nearest point in $K$, denoted by $P_K(x)$, such that
$\|x - P_K(x)\| \le \|x - y\|$ for all $y \in K$; note also that $P_K(x) = x$ if $x \in K$. The
mapping $P_K$ is called the metric projection of $\mathbb{R}^n$ onto $K$. It is well known that
$P_K : \mathbb{R}^n \to K$ is characterized by the properties:

(i) $P_K(x) = z$ for $x \in \mathbb{R}^n$ if and only if $\langle z, y - z \rangle \ge \langle x, y - z \rangle$ for all $y \in K$;
(ii) For every $x, y \in \mathbb{R}^n$, $\|P_K(x) - P_K(y)\|^2 \le \langle x - y, P_K(x) - P_K(y) \rangle$;
(iii) $\|P_K(x) - P_K(y)\| \le \|x - y\|$ for every $x, y \in \mathbb{R}^n$, that is, $P_K$ is a nonexpansive
map.


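Proposition 2 Let $G : \mathbb{R}^n \to \mathbb{R}^n_+$ be a differentiable operator. An element $w^* \in \mathbb{R}^n$
minimizes the mapping $G$ defined in (3) if and only if $w^*$ is the fixed point of the map

$$P_{\mathbb{R}^n_+}(I - \rho G') : \mathbb{R}^n \to \mathbb{R}^n_+ \quad \text{for any } \rho > 0,$$

that is,

$$w^* = P_{\mathbb{R}^n_+}(I - \rho G')(w^*) = P_{\mathbb{R}^n_+}(w^* - \rho\,G'(w^*)),$$

where $P_{\mathbb{R}^n_+}$ is a projection map from $\mathbb{R}^n$ to $\mathbb{R}^n_+$.

Proof Suppose $w^* \in \mathbb{R}^n_+$ is a solution of GSVM. Then for $\eta > 0$, we have

$$\langle \eta\,G'(w^*), w - w^* \rangle \ge 0 \quad \text{for all } w \in \mathbb{R}^n.$$

Adding $\langle w^*, w - w^* \rangle$ on both sides, we get

$$\langle w^*, w - w^* \rangle + \langle \eta\,G'(w^*), w - w^* \rangle \ge \langle w^*, w - w^* \rangle \quad \text{for all } w \in \mathbb{R}^n,$$

which further implies that

$$\langle w^*, w - w^* \rangle \ge \langle w^* - \eta\,G'(w^*), w - w^* \rangle \quad \text{for all } w \in \mathbb{R}^n,$$

which is possible only if $w^* = P_{\mathbb{R}^n_+}(w^* - \rho\,G'(w^*))$ with $\rho = \eta$, that is, $w^*$ is the fixed point
of $P_{\mathbb{R}^n_+}(I - \rho G')$.
Conversely, let $w^* = P_{\mathbb{R}^n_+}(w^* - \rho\,G'(w^*))$ with $\rho = \eta$. Then we have

$$\langle w^*, w - w^* \rangle \ge \langle w^* - \eta\,G'(w^*), w - w^* \rangle \quad \text{for all } w \in \mathbb{R}^n,$$

which implies

$$\langle w^*, w - w^* \rangle - \langle w^* - \eta\,G'(w^*), w - w^* \rangle \ge 0 \quad \text{for all } w \in \mathbb{R}^n,$$

and so

$$\langle \eta\,G'(w^*), w - w^* \rangle \ge 0 \quad \text{for all } w \in \mathbb{R}^n.$$

Thus $w^* \in \mathbb{R}^n_+$ is the solution of GSVM. □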


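Definition 4 A map $G : \mathbb{R}^n \to \mathbb{R}^n$ is said to be
(I) L-Lipschitz if there exists $L > 0$ such that

$$\|G(x) - G(y)\| \le L\,\|x - y\| \quad \text{for all } x, y \in \mathbb{R}^n.$$

(II) monotone if

$$\langle G(x) - G(y), x - y \rangle \ge 0 \quad \text{for all } x, y \in \mathbb{R}^n.$$

(III) strictly monotone if

$$\langle G(x) - G(y), x - y \rangle > 0 \quad \text{for all } x, y \in \mathbb{R}^n \text{ with } x \neq y.$$

(IV) $\alpha$-strongly monotone if there exists $\alpha > 0$ such that

$$\langle G(x) - G(y), x - y \rangle \ge \alpha\,\|x - y\|^2 \quad \text{for all } x, y \in \mathbb{R}^n.$$

Note that every $\alpha$-strongly monotone map $G : \mathbb{R}^n \to \mathbb{R}^n$ is strictly monotone and
every strictly monotone map is monotone.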


Example 3 Let $G : \mathbb{R}^n \to \mathbb{R}^n$ be a mapping defined as

$$G(w_i) = \alpha w_i + \beta,$$

where $\alpha$ is any nonnegative scalar and $\beta$ is any real number. Then $G$ is Lipschitz
continuous with Lipschitz constant $L = \alpha$.
Also, for any $x, y \in \mathbb{R}^n$,

$$\langle G(x) - G(y), x - y \rangle = \alpha\,\|x - y\|^2,$$

which shows that $G$ is $\alpha$-strongly monotone.
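The two identities of this example can be confirmed on random vectors. In the sketch below, `make_affine`, `inner`, `diff` and `monotonicity_pairing` are hypothetical helpers, and the real number $\beta$ is assumed to be added to every component of the vector.

```python
import math
import random

def make_affine(alpha: float, beta: float):
    """Return the map x -> alpha * x + beta (beta added to each component; an assumption)."""
    def G(x):
        return [alpha * xi + beta for xi in x]
    return G

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def diff(u, v):
    return [a - b for a, b in zip(u, v)]

def monotonicity_pairing(G, x, y):
    """The pairing <G(x) - G(y), x - y> used in the monotonicity definitions."""
    return inner(diff(G(x), G(y)), diff(x, y))

random.seed(0)
alpha, beta = 1.5, -0.7
G = make_affine(alpha, beta)
x = [random.uniform(-1, 1) for _ in range(4)]
y = [random.uniform(-1, 1) for _ in range(4)]
d = diff(x, y)

print(monotonicity_pairing(G, x, y), alpha * inner(d, d))          # equal: strong monotonicity
print(math.sqrt(inner(diff(G(x), G(y)), diff(G(x), G(y)))),
      alpha * math.sqrt(inner(d, d)))                              # equal: Lipschitz constant alpha
```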


Theorem 1 Let $K \subseteq \mathbb{R}^n$ be closed and convex and let $G' : \mathbb{R}^n \to K$ be strictly
monotone. If there exists a $w^* \in K$ which is the solution of GSVM, then $w^*$ is
unique in $K$.

Proof Suppose that $w_1^*, w_2^* \in K$ with $w_1^* \neq w_2^*$ are two solutions of GSVM. Then
we have

$$\langle \eta\,G'(w_1^*), w - w_1^* \rangle \ge 0 \quad \text{for all } w \in \mathbb{R}^n, \qquad (20)$$

and

$$\langle \eta\,G'(w_2^*), w - w_2^* \rangle \ge 0 \quad \text{for all } w \in \mathbb{R}^n, \qquad (21)$$

where $\eta > 0$. Putting $w = w_2^*$ in (20) and $w = w_1^*$ in (21), we get

$$\langle \eta\,G'(w_1^*), w_2^* - w_1^* \rangle \ge 0, \qquad (22)$$

and

$$\langle \eta\,G'(w_2^*), w_1^* - w_2^* \rangle \ge 0. \qquad (23)$$

Equation (22) can be further written as

$$\langle -\eta\,G'(w_1^*), w_1^* - w_2^* \rangle \ge 0. \qquad (24)$$

Adding (23) and (24) implies that

$$\langle \eta\,G'(w_2^*) - \eta\,G'(w_1^*), w_1^* - w_2^* \rangle \ge 0,$$

which implies

$$\eta\,\langle G'(w_1^*) - G'(w_2^*), w_1^* - w_2^* \rangle \le 0,$$

or equivalently,

$$\langle G'(w_1^*) - G'(w_2^*), w_1^* - w_2^* \rangle \le 0. \qquad (25)$$

Since $G'$ is strictly monotone, we must have

$$\langle G'(w_1^*) - G'(w_2^*), w_1^* - w_2^* \rangle > 0,$$

which contradicts (25). Thus $w_1^* = w_2^*$. □


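Theorem 2 Let $K \subseteq \mathbb{R}^n$ be closed and convex. If the map $G' : \mathbb{R}^n \to K$ is
L-Lipschitz and $\alpha$-strongly monotone, then there exists a unique $w^* \in K$ which is the
solution of GSVM.

Proof Uniqueness: Suppose that $w_1^*, w_2^* \in K$ are two solutions of GSVM. Then
for $\eta > 0$, we have

$$\langle \eta\,G'(w_1^*), w - w_1^* \rangle \ge 0 \quad \text{for all } w \in \mathbb{R}^n, \qquad (27)$$

and

$$\langle \eta\,G'(w_2^*), w - w_2^* \rangle \ge 0 \quad \text{for all } w \in \mathbb{R}^n. \qquad (28)$$

Putting $w = w_2^*$ in (27) and $w = w_1^*$ in (28), we get

$$\langle \eta\,G'(w_1^*), w_2^* - w_1^* \rangle \ge 0 \qquad (29)$$

and

$$\langle \eta\,G'(w_2^*), w_1^* - w_2^* \rangle \ge 0. \qquad (30)$$

Equation (29) can be further written as

$$\langle -\eta\,G'(w_1^*), w_1^* - w_2^* \rangle \ge 0. \qquad (31)$$

Adding (30) and (31) implies that

$$\langle \eta\,G'(w_2^*) - \eta\,G'(w_1^*), w_1^* - w_2^* \rangle \ge 0, \qquad (32)$$

which implies

$$\eta\,\langle G'(w_1^*) - G'(w_2^*), w_1^* - w_2^* \rangle \le 0. \qquad (33)$$

Since $G'$ is $\alpha$-strongly monotone, we have

$$\alpha\eta\,\|w_1^* - w_2^*\|^2 \le \eta\,\langle G'(w_1^*) - G'(w_2^*), w_1^* - w_2^* \rangle \le 0,$$

which implies that

$$\alpha\eta\,\|w_1^* - w_2^*\|^2 \le 0.$$

As $\alpha\eta > 0$, we conclude $\|w_1^* - w_2^*\| = 0$ and hence $w_1^* = w_2^*$.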
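Existence: As we know, if $w^* \in \mathbb{R}^n_+$ is a solution of GSVM, then for $\eta > 0$ we
have

$$\langle \eta\,G'(w^*), w - w^* \rangle \ge 0 \quad \text{for all } w \in \mathbb{R}^n,$$

if and only if

$$w^* = P_{\mathbb{R}^n_+}(w^* - \rho\,G'(w^*)) = F(w^*) \text{ (say)}. \qquad (34)$$

Now for any $w_1^*, w_2^* \in \mathbb{R}^n_+$, we have

$$\|F(w_1^*) - F(w_2^*)\|^2 = \|P_{\mathbb{R}^n_+}(w_1^* - \rho\,G'(w_1^*)) - P_{\mathbb{R}^n_+}(w_2^* - \rho\,G'(w_2^*))\|^2$$
$$\le \|(w_1^* - \rho\,G'(w_1^*)) - (w_2^* - \rho\,G'(w_2^*))\|^2$$
$$= \|(w_1^* - w_2^*) - \rho\,[G'(w_1^*) - G'(w_2^*)]\|^2$$
$$= \langle (w_1^* - w_2^*) - \rho\,[G'(w_1^*) - G'(w_2^*)], (w_1^* - w_2^*) - \rho\,[G'(w_1^*) - G'(w_2^*)] \rangle$$
$$= \|w_1^* - w_2^*\|^2 - 2\rho\,\langle w_1^* - w_2^*, G'(w_1^*) - G'(w_2^*) \rangle + \rho^2\,\|G'(w_1^*) - G'(w_2^*)\|^2. \qquad (35)$$

Since $G'$ is L-Lipschitz and $\alpha$-strongly monotone, we get

$$\|F(w_1^*) - F(w_2^*)\|^2 \le \|w_1^* - w_2^*\|^2 - 2\rho\alpha\,\|w_1^* - w_2^*\|^2 + \rho^2 L^2\,\|w_1^* - w_2^*\|^2 = (1 + \rho^2 L^2 - 2\rho\alpha)\,\|w_1^* - w_2^*\|^2,$$

that is

$$\|F(w_1^*) - F(w_2^*)\| \le \theta\,\|w_1^* - w_2^*\|, \qquad (36)$$

where $\theta = \sqrt{1 + \rho^2 L^2 - 2\rho\alpha}$. Since $\rho > 0$, when $\rho \in (0, 2\alpha/L^2)$ we get
$\theta \in [0, 1)$. Now, by using the Banach contraction principle, we obtain the fixed point
of the map $F$, that is, there exists a unique $w^* \in \mathbb{R}^n_+$ such that

$$F(w^*) = P_{\mathbb{R}^n_+}(w^* - \rho\,G'(w^*)) = w^*.$$

Hence $w^* \in \mathbb{R}^n_+$ is the solution of GSVM. □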
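The existence argument is constructive: iterating the projected map converges to the fixed point whenever the contraction factor is below 1, which requires $\rho^2 L^2 < 2\rho\alpha$. The sketch below is an illustration under stated assumptions: the operator `grad` is a hypothetical affine map with $L = \alpha = 2$ (not taken from the chapter), and the projection is onto the nonnegative orthant.

```python
import math

def project_nonneg(v):
    """Metric projection onto the nonnegative orthant: clip negative entries to zero."""
    return [max(0.0, vi) for vi in v]

def contraction_factor(rho, L, alpha):
    """theta = sqrt(1 + rho^2 L^2 - 2 rho alpha) from the existence proof."""
    return math.sqrt(1.0 + rho * rho * L * L - 2.0 * rho * alpha)

def fixed_point_iteration(grad, w0, rho, tol=1e-12, max_iter=10_000):
    """Iterate w <- P(w - rho * grad(w)) until successive iterates differ by less than tol."""
    w = list(w0)
    for _ in range(max_iter):
        g = grad(w)
        w_next = project_nonneg([wi - rho * gi for wi, gi in zip(w, g)])
        if max(abs(a - b) for a, b in zip(w, w_next)) < tol:
            return w_next
        w = w_next
    return w

# Hypothetical strongly monotone operator: grad(w) = 2 w + (-2, 1), so L = alpha = 2.
shift = [-2.0, 1.0]
grad = lambda w: [2.0 * wi + si for wi, si in zip(w, shift)]

rho = 0.25                                    # inside (0, 2*alpha/L^2) = (0, 1)
print(contraction_factor(rho, 2.0, 2.0))      # 0.5
w_star = fixed_point_iteration(grad, [5.0, 5.0], rho)
print(w_star)                                 # close to [1.0, 0.0]
```

In the first coordinate the operator vanishes at 1, while in the second it stays positive on the orthant, so the projection pins that coordinate at 0.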


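Example 4 Let us take the group of data of positive class $(\alpha_1, \alpha_2, \ldots, \alpha_{n-1}, 0)$,
$(\alpha_1, \alpha_2, \ldots, \alpha_{n-2}, 0, \alpha_n)$, ..., $(0, \alpha_2, \alpha_3, \ldots, \alpha_n)$ and negative class
$(k\alpha_1, k\alpha_2, \ldots, k\alpha_{n-1}, 0)$, $(k\alpha_1, k\alpha_2, \ldots, k\alpha_{n-2}, 0, k\alpha_n)$, ..., $(0, k\alpha_2, k\alpha_3, \ldots, k\alpha_n)$
for $n \ge 2$, where each $\alpha_i \neq 0$ for $i \in \mathbb{N}$ and $k \neq 1$.
A map $G : \mathbb{R}^n \to \mathbb{R}^n_+$ is given by

$$G(w_i) = (\|w_i\|, \|w_i\|, \ldots, \|w_i\|) \quad \text{for } i = 1, 2, \ldots, n,$$

where $w_i$ are the rows of $W_{n \times n}$, for $i = 1, 2, \ldots, n$. Then we have

$$G'(w_i) = \frac{1}{\|w_i\|}\,w_i \quad \text{for } i = 1, 2, \ldots, n.$$

Now from the given data, we get

$$W = \frac{2}{(n-1)(1-k)} \begin{pmatrix} \frac{1}{\alpha_1} & \frac{1}{\alpha_2} & \cdots & \frac{1}{\alpha_n} \\ \vdots & \vdots & & \vdots \\ \frac{1}{\alpha_1} & \frac{1}{\alpha_2} & \cdots & \frac{1}{\alpha_n} \end{pmatrix},$$

and so we have

$$G(w_i) = \frac{2}{(n-1)(1-k)} \sqrt{\frac{1}{\alpha_1^2} + \frac{1}{\alpha_2^2} + \cdots + \frac{1}{\alpha_n^2}}\;(1, 1, \ldots, 1) \quad \text{for } i = 1, 2, \ldots, n,$$

and

$$G'(w_i) = \frac{1}{\sqrt{\frac{1}{\alpha_1^2} + \frac{1}{\alpha_2^2} + \cdots + \frac{1}{\alpha_n^2}}} \left(\frac{1}{\alpha_1}, \frac{1}{\alpha_2}, \ldots, \frac{1}{\alpha_n}\right) \quad \text{for } i = 1, 2, \ldots, n.$$

Note that, for any $w_1, w_2 \in W$,

$$\|G'(w_1) - G'(w_2)\| = 0 \le L\,\|w_1 - w_2\|$$

is satisfied, where $L$ is any nonnegative real number. Also

$$\langle G'(w_1) - G'(w_2), w_1 - w_2 \rangle \ge 0,$$

which shows that $G'$ is a monotone operator. Moreover, $w = \frac{2}{(n-1)(1-k)} \left(\frac{1}{\alpha_1}, \frac{1}{\alpha_2}, \ldots, \frac{1}{\alpha_n}\right)$
is the solution of GSVM with $\|w\| = \frac{2}{(n-1)(1-k)} \sqrt{\frac{1}{\alpha_1^2} + \frac{1}{\alpha_2^2} + \cdots + \frac{1}{\alpha_n^2}}$.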


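Example 5 Let us take the group of data of positive class $(\alpha_1, \alpha_2, \ldots, \alpha_m, 0, 0, \ldots, 0)$,
$(0, \alpha_2, \alpha_3, \ldots, \alpha_{m+1}, 0, 0, \ldots, 0)$, ..., $(\alpha_1, \alpha_2, \ldots, \alpha_{m-1}, 0, 0, \ldots, 0, \alpha_n)$ and
negative class $(k\alpha_1, k\alpha_2, \ldots, k\alpha_m, 0, 0, \ldots, 0)$, $(0, k\alpha_2, k\alpha_3, \ldots, k\alpha_{m+1}, 0, 0, \ldots, 0)$, ...,
$(k\alpha_1, k\alpha_2, \ldots, k\alpha_{m-1}, 0, 0, \ldots, 0, k\alpha_n)$ for $n > m \ge 1$, where each
$\alpha_i \neq 0$ for $i \in \mathbb{N}$ and $k \neq 1$.

A map $G : \mathbb{R}^n \to \mathbb{R}^n_+$ is given by

$$G(w_i) = (\|w_i\|, \|w_i\|, \ldots, \|w_i\|) \quad \text{for } i = 1, 2, \ldots, n,$$

where $w_i$ are the rows of $W_{n \times n}$, for $i = 1, 2, \ldots, n$. Then we have

$$G'(w_i) = \frac{1}{\|w_i\|}\,w_i \quad \text{for } i = 1, 2, \ldots, n.$$

Now from the given data, we get

$$W = \frac{2}{m(1-k)} \begin{pmatrix} \frac{1}{\alpha_1} & \frac{1}{\alpha_2} & \cdots & \frac{1}{\alpha_n} \\ \vdots & \vdots & & \vdots \\ \frac{1}{\alpha_1} & \frac{1}{\alpha_2} & \cdots & \frac{1}{\alpha_n} \end{pmatrix},$$

and so we have

$$G(w_i) = \frac{2}{m(1-k)} \sqrt{\frac{1}{\alpha_1^2} + \frac{1}{\alpha_2^2} + \cdots + \frac{1}{\alpha_n^2}}\;(1, 1, \ldots, 1) \quad \text{for } i = 1, 2, \ldots, n,$$

and

$$G'(w_i) = \frac{1}{\sqrt{\frac{1}{\alpha_1^2} + \frac{1}{\alpha_2^2} + \cdots + \frac{1}{\alpha_n^2}}} \left(\frac{1}{\alpha_1}, \frac{1}{\alpha_2}, \ldots, \frac{1}{\alpha_n}\right) \quad \text{for } i = 1, 2, \ldots, n.$$

It is easy to verify that $G'$ is a monotone and Lipschitz continuous operator. The vector
$w = \frac{2}{m(1-k)} \left(\frac{1}{\alpha_1}, \frac{1}{\alpha_2}, \ldots, \frac{1}{\alpha_n}\right)$ solves GSVM and $\|w\| = \frac{2}{m(1-k)} \sqrt{\frac{1}{\alpha_1^2} + \frac{1}{\alpha_2^2} + \cdots + \frac{1}{\alpha_n^2}}$.
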
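Example 6 Consider $(\alpha_1, 0, 0)$, $(0, \alpha_2, 0)$, $(0, 0, \alpha_3)$, $(\beta_1, 0, 0)$, $(0, \beta_2, 0)$,
$(0, 0, \beta_3)$ as data of positive class and $(k\alpha_1, 0, 0)$, $(0, k\alpha_2, 0)$, $(0, 0, k\alpha_3)$, $(k\beta_1, 0, 0)$,
$(0, k\beta_2, 0)$, $(0, 0, k\beta_3)$ as negative class of data, where $\alpha_i$, $\beta_i$ and $k$ are positive
real numbers with each $\alpha_i \le \beta_i$ for $i = 1, 2, 3$ and $k \neq 1$.

The map $G : \mathbb{R}^3 \to \mathbb{R}^3_+$ is given as

$$G(w_i) = (\|w_i\|, \|w_i\|, \|w_i\|) \quad \text{for } i = 1, 2, 3,$$

where $w_i$ are the rows of $W_{3 \times 3}$ for $i = 1, 2, 3$. Then we have

$$G'(w_i) = \frac{1}{\|w_i\|}\,w_i \quad \text{for } i = 1, 2, 3.$$

Now from the given data, we get

$$W = \frac{2}{1-k} \begin{pmatrix} \frac{1}{\alpha_1} & \frac{1}{\alpha_2} & \frac{1}{\alpha_3} \\ \frac{1}{\alpha_1} & \frac{1}{\alpha_2} & \frac{1}{\alpha_3} \\ \frac{1}{\alpha_1} & \frac{1}{\alpha_2} & \frac{1}{\alpha_3} \end{pmatrix},$$

and so we have

$$G(w_i) = \frac{2}{1-k} \left(\sqrt{\frac{1}{\alpha_1^2} + \frac{1}{\alpha_2^2} + \frac{1}{\alpha_3^2}},\; \sqrt{\frac{1}{\alpha_1^2} + \frac{1}{\alpha_2^2} + \frac{1}{\alpha_3^2}},\; \sqrt{\frac{1}{\alpha_1^2} + \frac{1}{\alpha_2^2} + \frac{1}{\alpha_3^2}}\right),$$

and

$$G'(w_i) = \frac{1}{\sqrt{\frac{1}{\alpha_1^2} + \frac{1}{\alpha_2^2} + \frac{1}{\alpha_3^2}}} \left(\frac{1}{\alpha_1}, \frac{1}{\alpha_2}, \frac{1}{\alpha_3}\right).$$

Note that, for any $w_1, w_2 \in W$,

$$\|G'(w_1) - G'(w_2)\| = 0 \le L\,\|w_1 - w_2\|$$

is satisfied for $L > 0$. Also

$$\langle G'(w_1) - G'(w_2), w_1 - w_2 \rangle \ge 0$$

is satisfied, which shows that $G'$ is a monotone operator. Moreover, $w = \frac{2}{1-k} \left(\frac{1}{\alpha_1}, \frac{1}{\alpha_2}, \frac{1}{\alpha_3}\right)$
is the solution of GSVM with $\|w\| = \frac{2}{1-k} \sqrt{\frac{1}{\alpha_1^2} + \frac{1}{\alpha_2^2} + \frac{1}{\alpha_3^2}}$.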


Example 7 Let us take the group of data of positive class (1, 0, 0), (1, 1, 0), (0, 1, 1)
and negative class (−1/2, 0, 0), (−1/2, −1/2, 0), (0, −1/2, −1/2). Now from the given data,
we have

$$W = \begin{pmatrix} \frac{4}{3} & 0 & \frac{4}{3} \\ \frac{4}{3} & 0 & \frac{4}{3} \\ \frac{4}{3} & 0 & \frac{4}{3} \end{pmatrix}$$

with

$$G(w_i) = \frac{4}{3}\,(\sqrt{2}, \sqrt{2}, \sqrt{2}) \quad \text{for } i = 1, 2, 3,$$

and

$$G'(w_i) = \frac{1}{\sqrt{2}}\,(1, 0, 1) \quad \text{for } i = 1, 2, 3.$$

It is easy to verify that $G'$ is a monotone and Lipschitz continuous operator. Moreover,
$w = \left(\frac{4}{3}, 0, \frac{4}{3}\right)$ is the solution of GSVM with $\|w\| = \frac{4\sqrt{2}}{3}$.
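The values in this example can be verified by direct substitution. The sketch below makes one assumption that is not stated in the example: the offset is taken as $-1/3$ in every component, which is the value forced by the margin equations once each row of $W$ is $(4/3, 0, 4/3)$. The helper name `control` is hypothetical.

```python
from fractions import Fraction as Fr

ROW = (Fr(4, 3), Fr(0), Fr(4, 3))   # each row of W in the example
OFFSET = Fr(-1, 3)                  # assumed offset, derived from the margin equations

def control(x):
    """One component of W x + B; all three components coincide because the rows are equal."""
    return sum(r * xi for r, xi in zip(ROW, x)) + OFFSET

half = Fr(1, 2)
positives = [(1, 0, 0), (1, 1, 0), (0, 1, 1)]
negatives = [(-half, 0, 0), (-half, -half, 0), (0, -half, -half)]

print([control(x) for x in positives])   # every value is 1
print([control(x) for x in negatives])   # every value is -1
print(sum(r * r for r in ROW))           # squared norm 32/9, i.e. ||w|| = 4*sqrt(2)/3
```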


4 Conclusion 


Recently many results appeared in the literature addressing the problems related 
to the support vector machine and its applications. In this paper, we initiated the 
study of generalized support vector machine and presented linear classification of 
data by using support vector machine and generalized support vector machine. We 
also provided sufficient conditions under which the solution of generalized support 
vector machine exists. Various examples are also presented to show the validity of 
these results. Furthermore, one can study the results of generalized support vector 
machine for nonlinear classification of data. 
Linear and Nonlinear Classifiers of Data
with Support Vector Machines 

and Generalized Support Vector 
Machines 


Talat Nazir, Xiaomin Qi and Sergei Silvestrov 


Abstract The support vector machine for linear and nonlinear classification of data 
is studied. The notion of generalized support vector machine for data classifications is 
used. The problem of generalized support vector machine is shown to be equivalent to 
the problem of generalized variational inequality and various results for the existence 
of solutions are established. Moreover, examples supporting the results are provided. 


Keywords Linear and nonlinear classification - Support vector machine - General- 
ized support vector machine - Kernel function 


1 Support Vector Machine 


Support vector machines (SVMs) [2, 3, 13, 14, 18] were developed by Vapnik et al.
(1995) and are gaining popularity due to many attractive features. As a very powerful
tool for data classification and regression, they have been used in many fields, such as
text classification [5], facial expression recognition [9], gene analysis [4] and many
others [1, 6-8, 10-12, 17, 19-22]. Recently, they have been used for fault classification
in a water level control system [15], and an SVM-based fault classifier has been used to
diagnose the faults of a water level control process [16].
The classification problems can be restricted to consideration of the two-class
problems without loss of generality. The goal of support vector classification (SVC)
is to separate the two classes by a hyperplane which also works well on unseen
examples. The method is to find the optimal hyperplane that maximizes the margin
between two classes of data. The set of data is said to be optimally separated by the
hyperplane if it is separated without error and the distance between the closest data
is maximal. Support vector classification can be thought of as a process that uses given
data to find the decision plane which can guarantee good predictive performance on
unseen data. The process of finding the decision plane is a quadratic programming
process.

In this paper, we study the problems of support vector machine and generalized 
support vector machine. We also show the sufficient conditions for the existence 
of solutions for problems of generalized support vector machine. We also present 
various examples to support these results. 

Throughout this paper, by $\mathbb{N}$, $\mathbb{R}$, $\mathbb{R}^n$ and $\mathbb{R}^n_+$ we denote the set of all natural
numbers, the set of all real numbers, the set of all n-tuples of real numbers, and the set of
all n-tuples of nonnegative real numbers, respectively.

Also, we consider $\|\cdot\|$ and $\langle \cdot, \cdot \rangle$ as the Euclidean norm and the usual inner product
on $\mathbb{R}^n$, respectively, that is, $\langle x, y \rangle = x \cdot y = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n$ for all $x =
(x_1, x_2, \ldots, x_n)$, $y = (y_1, y_2, \ldots, y_n)$ in $\mathbb{R}^n$. Furthermore, for any two vectors $x, y \in
\mathbb{R}^n$, we say that $x \le y$ if and only if $x_i \le y_i$ for all $i \in \{1, 2, \ldots, n\}$, where $x_i$ and $y_i$
are the components of $x$ and $y$, respectively.


1.1 Data Classification 


In practice, data from complex real-world applications are often not linearly separable. Kernel
representations offer an alternative solution by projecting the data into a higher
dimensional feature space to increase the computational power of the linear learning
machine.

In order to learn linear or non-linear relations with a linear machine, a set of non-linear
features is selected. This is equivalent to applying a fixed non-linear mapping
function $\Phi$ that transforms data in input space $X$ to data in feature space $F$, in which
the linear machine can be used. For this classification, both spaces $X$ and $F$ need
to be vector spaces, where the dimensions of these two spaces may or may not be the same.
When the given data is linearly separable, we consider $\Phi$ as the identity operator. For
binary classification of data, we consider the decision function $f : \mathbb{R}^n \to \mathbb{R}$, where
the input $x = (x_1, \ldots, x_n)$ is assigned to the positive class if $f(x) \ge 0$, and otherwise
to the negative class. The decision function is defined as

$$f(x) = \langle w, \Phi(x) \rangle + b. \qquad (1)$$
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This means two steps will be built for non-linear machine: first a fixed non-linear 
mapping of the data to a feature space, and then a linear machine is used to classify 
them in the feature space. 

In addition, the vector $w$ is a linear combination of the support vectors in the
training data and can be written as

$$ w = \sum_i \alpha_i \Phi(x_i), \tag{2} $$

where each $\alpha_i$ is the Lagrange multiplier of the corresponding support vector.
So the decision function can be rewritten as

$$ f(x) = \sigma\Big( \sum_i \alpha_i \big(\Phi(x_i) \cdot \Phi(x)\big) + b \Big), \tag{3} $$

where $\sigma$ is a sign function.
The kernel $K$ is associated with the feature mapping $\Phi$: it takes two inputs
and gives their similarity in the feature space $F$, that is, $K : X \times X \to \mathbb{R}$ is defined as

$$ K(x_i, x) = \Phi(x_i) \cdot \Phi(x). \tag{4} $$

Thus, the decision function from (3) becomes

$$ f(x) = \sigma\Big( \sum_i \alpha_i K(x_i, x) + b \Big). \tag{5} $$

Some useful kernels for real valued vectors are defined below:

(I) Linear kernel

$$ K(x_i, x) = x_i \cdot x. $$

(II) Polynomial kernel (of degree $p$)

$$ K(x_i, x) = (x_i \cdot x)^p \quad \text{or} \quad (x_i \cdot x + 1)^p, $$

where $p$ is a tunable parameter.

(III) Radial Basis Function (RBF) kernel

$$ K(x_i, x) = \exp\left[-\gamma \|x_i - x\|^2\right], $$

where $\gamma$ is a hyperparameter (also called kernel bandwidth). The RBF kernel
corresponds to an infinite feature space.

(IV) Sigmoid kernel

$$ K(x_i, x) = \tanh(\kappa\, x_i \cdot x + \theta), $$

where $\kappa$ is a scalar and $\theta$ is the displacement.

(V) Inverse multi-quadratic kernel

$$ K(x_i, x) = \left(\|x_i - x\|^2 + \gamma^{-2}\right)^{-1/2}, $$

where $\gamma$ is a hyperparameter (also called kernel bandwidth).
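
The kernels (I)-(V) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and default parameter values are hypothetical, and the inverse multi-quadratic form follows our reading of the (partly garbled) formula (V). The last lines check numerically that the degree-2 polynomial kernel agrees with the inner product of the explicit feature map $\Phi(x) = (x_1^2, \sqrt{2}x_1x_2, x_2^2)$ used in the examples.

```python
import math

def dot(u, v):
    """Usual inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def sq_dist(u, v):
    """Squared Euclidean distance ||u - v||^2."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def linear_kernel(xi, x):
    return dot(xi, x)

def polynomial_kernel(xi, x, p=2, c=0.0):
    # c = 0 gives (xi . x)^p, c = 1 gives (xi . x + 1)^p
    return (dot(xi, x) + c) ** p

def rbf_kernel(xi, x, gamma=1.0):
    return math.exp(-gamma * sq_dist(xi, x))

def sigmoid_kernel(xi, x, kappa=1.0, theta=0.0):
    return math.tanh(kappa * dot(xi, x) + theta)

def inverse_multiquadric_kernel(xi, x, gamma=1.0):
    # Assumed form: (||xi - x||^2 + gamma^(-2))^(-1/2)
    return (sq_dist(xi, x) + gamma ** -2) ** -0.5

def phi(x):
    """Explicit degree-2 feature map for two-dimensional inputs."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

u, v = (1.0, 2.0), (3.0, -1.0)
kernel_value = polynomial_kernel(u, v, p=2)   # (u . v)^2
feature_value = dot(phi(u), phi(v))           # Phi(u) . Phi(v)
```

Here `kernel_value` and `feature_value` agree up to rounding, which is the identity (4) for this particular feature map.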


Now, from (1), we define the functional margin of an example $(\Phi(x_i), y_i)$ with
respect to a hyperplane $(w, b)$ to be the quantity

$$ \gamma_i = y_i \left( \langle w, \Phi(x_i) \rangle + b \right), $$

where $y_i \in \{-1, 1\}$. Note that $\gamma_i > 0$ implies correct classification of $(x_i, y_i)$. If we
replace the functional margin by the geometric margin, we obtain the equivalent quantity for
the normalized linear function $\left( \frac{1}{\|w\|} w, \frac{1}{\|w\|} b \right)$, which therefore measures the Euclidean
distances of the points from the decision boundary in the input space.


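The geometric margin, denoted here by $\tilde{\gamma}$, is the functional margin $\gamma$ divided by the norm of $w$:

$$ \tilde{\gamma} = \frac{\gamma}{\|w\|}. $$

To find the hyperplane which has maximal geometric margin for a training set $S$
means to find the maximal $\tilde{\gamma}$. For convenience, we fix the functional margin at $\gamma = 1$, so the objective function can
be written as

$$ \max \frac{1}{\|w\|}. $$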


Of course, there are some constraints for the optimization problem. According to
the definition of margin, we have $y_i \left( \langle w, \Phi(x_i) \rangle + b \right) \ge 1$, $i = 1, \dots, l$. We rewrite
the objective function with the constraints in the equivalent form

$$ \min \frac{1}{2} \|w\|^2 \quad \text{such that} \quad y_i \left( \langle w, \Phi(x_i) \rangle + b \right) \ge 1, \quad i = 1, \dots, l. \tag{6} $$

We denote this problem by SVM for data classification.


Example 1 Let us take the group of points $(0, 2)$, $(0, -2)$, $(1, 1)$, $(1, -1)$,
$(-1, 1)$, $(-1, -1)$ as positive class and the group of points $(2, 0)$, $(-2, 0)$, $(2, 1)$,
$(2, -1)$, $(-2, 1)$, $(-2, -1)$ as negative class, as shown in Fig. 1.

We use the mapping function

$$ \Phi(x) = \left( x_1^2, \sqrt{2}\, x_1 x_2, x_2^2 \right), $$

which transforms data from the two-dimensional input space to the three-dimensional
feature space. This gives $(1, \sqrt{2}, 1)$, $(1, -\sqrt{2}, 1)$ and $(0, 0, 4)$ as positive data and
$(4, 2\sqrt{2}, 1)$, $(4, -2\sqrt{2}, 1)$ and $(4, 0, 0)$ as negative data, as shown in Fig. 2.

Fig. 1 The data points given in Example 1

Fig. 2 The data separation in three dimensional feature space

Now, by using this data in the three dimensional feature space, we consider the following. For positive points, we have

$$ (w_1, w_2, w_3) \begin{pmatrix} 1 \\ \sqrt{2} \\ 1 \end{pmatrix} + b \ge 1, \qquad
(w_1, w_2, w_3) \begin{pmatrix} 1 \\ -\sqrt{2} \\ 1 \end{pmatrix} + b \ge 1, \qquad
(w_1, w_2, w_3) \begin{pmatrix} 0 \\ 0 \\ 4 \end{pmatrix} + b \ge 1, $$

which implies

$$ \begin{aligned}
w_1 + \sqrt{2}\, w_2 + w_3 + b &\ge 1, \\
w_1 - \sqrt{2}\, w_2 + w_3 + b &\ge 1, \\
4 w_3 + b &\ge 1.
\end{aligned} $$

For negative points, we have

$$ (w_1, w_2, w_3) \begin{pmatrix} 4 \\ 2\sqrt{2} \\ 1 \end{pmatrix} + b \le -1, \qquad
(w_1, w_2, w_3) \begin{pmatrix} 4 \\ -2\sqrt{2} \\ 1 \end{pmatrix} + b \le -1, \qquad
(w_1, w_2, w_3) \begin{pmatrix} 4 \\ 0 \\ 0 \end{pmatrix} + b \le -1, $$

implying that

$$ \begin{aligned}
4 w_1 + 2\sqrt{2}\, w_2 + w_3 + b &\le -1, \\
4 w_1 - 2\sqrt{2}\, w_2 + w_3 + b &\le -1, \\
4 w_1 + b &\le -1.
\end{aligned} $$


From these inequalities, we get $w = (-0.6667, 0, 0)$ with $\|w\| = 0.6667$, as shown
in Fig. 3.

Further, if we use the Radial Basis Function (RBF) kernel $K(x_i, x) = \exp[-\gamma \|x_i - x\|^2]$
with $\gamma = 1/3$, we get $w = (0.0031, 0.0012)$, which is shown in Fig. 4.

Also, if we use the Sigmoid kernel $K(x_i, x) = \tanh(\kappa\, x_i \cdot x + \theta)$ with $\kappa = 1/3$ and
$\theta = 2.85$, we get $w = (0, 0)$, as shown in Fig. 5.
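
The polynomial-kernel result of Example 1 can be checked numerically. The sketch below maps the twelve points through $\Phi$ and verifies that $w = (-2/3, 0, 0)$ satisfies every constraint of problem (6). The bias $b = 5/3$ is not stated in the text; it is an assumption, obtained here from the two active constraints $w_1 + b = 1$ and $4w_1 + b = -1$. The code checks feasibility and the norm only, not optimality.

```python
import math

def phi(x):
    """Degree-2 feature map Phi(x) = (x1^2, sqrt(2) x1 x2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

positive = [(0, 2), (0, -2), (1, 1), (1, -1), (-1, 1), (-1, -1)]
negative = [(2, 0), (-2, 0), (2, 1), (2, -1), (-2, 1), (-2, -1)]

w = (-2 / 3, 0.0, 0.0)   # reported in the example as (-0.6667, 0, 0)
b = 5 / 3                # assumption: bias recovered from the active constraints

def decision(x):
    """<w, Phi(x)> + b in the feature space."""
    return sum(wi * zi for wi, zi in zip(w, phi(x))) + b

# Functional margins y_i (<w, Phi(x_i)> + b) for all training points.
margins = [decision(x) for x in positive] + [-decision(x) for x in negative]

feasible = all(m >= 1 - 1e-9 for m in margins)   # constraints of problem (6)
norm_w = math.sqrt(sum(wi * wi for wi in w))     # ||w||
```

With these values every margin is at least 1, the smallest margin equals 1, and `norm_w` is approximately 0.6667, in agreement with the text.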


Example 2 Let us look at another example. The positive data are shown as red squares
and the negative data as blue circles in Fig. 6.

This is also a problem that is not linearly separable. Now, if we transfer the original data into
the feature space by using the mapping function $\Phi(x)$, we can see that the data in
the feature space is linearly separable (see Fig. 7).


Fig. 3 The data separation using Polynomial Kernel of degree 2

Fig. 4 The data separation using Radial Basis Function (RBF)

Fig. 5 The data separation using Sigmoid Kernel

Fig. 6 The data points given in Example 2

Fig. 7 The data separation in feature space of Example 2

Fig. 8 The data separation of Example 2 using Polynomial Kernel

Fig. 9 The data separation of Example 2 using RBF Kernel


Using the Polynomial kernel with $p = 2$, we get $w = (-0.4898, -0.1633)$, which is
shown in Fig. 8.

Next, if we use the Radial Basis Function (RBF) kernel $K(x_i, x) = \exp[-\gamma \|x_i - x\|^2]$
with $\gamma = 2$, we get $w = (-0.0016, 0.0014)$, as shown in Fig. 9.


Example 3 Consider the points $(0, 0)$, $(1, 0)$, $(-1, 0)$ as positive class and the points
$(2, 0)$, $(3, 0)$, $(-2, 0)$, $(-3, 0)$ as negative class (see Fig. 10).

Fig. 10 The data points given in Example 3

Fig. 11 Data separation of Example 3 by using Polynomial Kernel of degree 2

Note that no linear separator exists for this data in the input space. Now, if we
use $\Phi(x) = \left( x_1^2, \sqrt{2}\, x_1 x_2, x_2^2 \right)$, then it transforms the two-dimensional data into the three-
dimensional feature space, where it can be separated by a hyperplane $H$ as shown in
Fig. 11.
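
A short numerical sketch of why the mapping helps in Example 3: after applying $\Phi$, the first feature coordinate $x_1^2$ alone separates the two classes. The particular hyperplane used below (first coordinate equal to 2.5, the midpoint of the gap) is a hypothetical choice for illustration; the text does not specify $H$.

```python
import math

def phi(x):
    """Degree-2 feature map Phi(x) = (x1^2, sqrt(2) x1 x2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

positive = [(0, 0), (1, 0), (-1, 0)]
negative = [(2, 0), (3, 0), (-2, 0), (-3, 0)]

# First feature coordinate of every mapped point.
first_pos = [phi(x)[0] for x in positive]   # values 0, 1, 1
first_neg = [phi(x)[0] for x in negative]   # values 4, 9, 4, 9

# Hypothetical separating hyperplane z1 = threshold (midpoint of the gap).
threshold = (max(first_pos) + min(first_neg)) / 2

separated = (all(z < threshold for z in first_pos)
             and all(z > threshold for z in first_neg))
```

In the input space the positive points lie between the negative ones on the same line, so no single threshold on $x_1$ works; in the feature space the threshold on $x_1^2$ does.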


2 Generalized Support Vector Machines 


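Consider a new control function $F : \mathbb{R}^n \to \mathbb{R}^p$ defined as

$$ F(x) = W \Phi(x) + B, \tag{7} $$

where $W \in \mathbb{R}^{p \times p}$, $B \in \mathbb{R}^p$ are parameters and $p$ is the dimension of the feature space. In
addition, $W$ contains the $w_i$ as rows, where each $w_i$ is a linear combination of the
support vectors in the feature space and can be written as

$$ w_i = \sum_j \alpha_j^{(i)} \Phi(x_j), \tag{8} $$

where $\Phi$ is a mapping that transforms data in the input space $X$ to data in the feature space
$F$. From (7), we obtain

$$ \begin{aligned}
F(x) &= \begin{pmatrix} \sum_j \alpha_j^{(1)} \Phi(x_j) \\ \vdots \\ \sum_j \alpha_j^{(p)} \Phi(x_j) \end{pmatrix} \Phi(x) + B
= \begin{pmatrix} \sum_j \alpha_j^{(1)} \Phi(x_j) \Phi(x) \\ \vdots \\ \sum_j \alpha_j^{(p)} \Phi(x_j) \Phi(x) \end{pmatrix} + B \\
&= \begin{pmatrix} \sum_j \alpha_j^{(1)} K(x_j, x) \\ \vdots \\ \sum_j \alpha_j^{(p)} K(x_j, x) \end{pmatrix} + B
= \begin{pmatrix} \sum_j \alpha_j^{(1)} \\ \vdots \\ \sum_j \alpha_j^{(p)} \end{pmatrix} K(x_j, x) + B,
\end{aligned} $$

where $K(x_j, x)$ is the kernel associated with the feature mapping $\Phi$.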

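Define

$$ \gamma_k = y_k \left( W \Phi(x) + B \right)
= y_k \left( \begin{pmatrix} \sum_j \alpha_j^{(1)} \\ \vdots \\ \sum_j \alpha_j^{(p)} \end{pmatrix} K(x_j, x) + B \right)
= y_k \left( C K(x_j, x) + B \right) \ge 1 \quad \text{for } k = 1, 2, \dots, l, $$

where $y_k \in \{(-1, -1, \dots, -1), (1, 1, \dots, 1)\}$ is a $p$-dimensional vector, $K(x_j, x) = \Phi(x_j) \Phi(x)$ and

$$ C = \begin{pmatrix} \sum_j \alpha_j^{(1)} \\ \vdots \\ \sum_j \alpha_j^{(p)} \end{pmatrix}. $$
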


Definition 1 We define a map $G : \mathbb{R}^p \to \mathbb{R}^p_+$ by

$$ G(w_i) = \left( \|w_i\|, \|w_i\|, \dots, \|w_i\| \right) \quad \text{for } i = 1, 2, \dots, p, \tag{9} $$

where $w_i$ are the rows of $W_{p \times p}$, for $i = 1, 2, \dots, p$.

Now, the problem is to find $w_i \in \mathbb{R}^p$ that satisfy

$$ \min_{w_i \in W} G(w_i) \quad \text{such that} \quad \eta \ge 0, \tag{10} $$

where $\eta = y_k \left( C K(x_j, x) + B \right) - 1$.

We call this problem the Generalized Support Vector Machine (GSVM).

Note that, if

$$ \begin{pmatrix} \sum_j \alpha_j^{(1)} \\ \vdots \\ \sum_j \alpha_j^{(p)} \end{pmatrix} K(x_j, x) = -B, $$

then $\eta = -1$ and we obtain no solution of the GSVM problem.


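Example 4 Consider the data points for the positive and negative classes given in
Example 1. By using the polynomial kernel of degree two, we obtain $(1, \sqrt{2}, 1)$,
$(1, -\sqrt{2}, 1)$, $(0, 0, 4)$ as the vectors of positive data and $(4, 2\sqrt{2}, 1)$, $(4, -2\sqrt{2}, 1)$,
$(4, 0, 0)$ as the vectors of negative data in the feature space. From the positive data points, we have

$$ \begin{pmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \\ w_{31} & w_{32} & w_{33} \end{pmatrix} \begin{pmatrix} 1 \\ \sqrt{2} \\ 1 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} \ge \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, $$

$$ \begin{pmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \\ w_{31} & w_{32} & w_{33} \end{pmatrix} \begin{pmatrix} 1 \\ -\sqrt{2} \\ 1 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} \ge \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, $$

$$ \begin{pmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \\ w_{31} & w_{32} & w_{33} \end{pmatrix} \begin{pmatrix} 0 \\ 0 \\ 4 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} \ge \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, $$

which gives

$$ \begin{aligned}
w_{11} + \sqrt{2}\, w_{12} + w_{13} + b_1 &\ge 1, & w_{21} + \sqrt{2}\, w_{22} + w_{23} + b_2 &\ge 1, & w_{31} + \sqrt{2}\, w_{32} + w_{33} + b_3 &\ge 1, \\
w_{11} - \sqrt{2}\, w_{12} + w_{13} + b_1 &\ge 1, & w_{21} - \sqrt{2}\, w_{22} + w_{23} + b_2 &\ge 1, & w_{31} - \sqrt{2}\, w_{32} + w_{33} + b_3 &\ge 1, \\
4 w_{13} + b_1 &\ge 1, & 4 w_{23} + b_2 &\ge 1, & 4 w_{33} + b_3 &\ge 1.
\end{aligned} $$

Also, from the negative data points,

$$ \begin{pmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \\ w_{31} & w_{32} & w_{33} \end{pmatrix} \begin{pmatrix} 4 \\ 2\sqrt{2} \\ 1 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} \le \begin{pmatrix} -1 \\ -1 \\ -1 \end{pmatrix}, $$

$$ \begin{pmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \\ w_{31} & w_{32} & w_{33} \end{pmatrix} \begin{pmatrix} 4 \\ -2\sqrt{2} \\ 1 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} \le \begin{pmatrix} -1 \\ -1 \\ -1 \end{pmatrix}, $$

$$ \begin{pmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \\ w_{31} & w_{32} & w_{33} \end{pmatrix} \begin{pmatrix} 4 \\ 0 \\ 0 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} \le \begin{pmatrix} -1 \\ -1 \\ -1 \end{pmatrix}, $$

which gives

$$ \begin{aligned}
4 w_{11} + 2\sqrt{2}\, w_{12} + w_{13} + b_1 &\le -1, & 4 w_{21} + 2\sqrt{2}\, w_{22} + w_{23} + b_2 &\le -1, & 4 w_{31} + 2\sqrt{2}\, w_{32} + w_{33} + b_3 &\le -1, \\
4 w_{11} - 2\sqrt{2}\, w_{12} + w_{13} + b_1 &\le -1, & 4 w_{21} - 2\sqrt{2}\, w_{22} + w_{23} + b_2 &\le -1, & 4 w_{31} - 2\sqrt{2}\, w_{32} + w_{33} + b_3 &\le -1, \\
4 w_{11} + b_1 &\le -1, & 4 w_{21} + b_2 &\le -1, & 4 w_{31} + b_3 &\le -1.
\end{aligned} $$

By solving these inequalities, we get

$$ W = \begin{pmatrix} -1.39 & -0.512 & -0.627 \\ 0.667 & 0 & -0.667 \\ 0.667 & 0 & 0 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 3.742 \\ 1.047 \\ 1.51 \end{pmatrix}, $$

with the smallest norm of $w_i$

$$ \min_{w_i \in W} G(w_i) = (0.667, 0.667, 0.667). $$

Hence we get $w = (0.667, 0, 0)$ that minimizes $G(w_i)$ for $i = 1, 2, 3$.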
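
The selection step of problem (10) can be illustrated with the matrix $W$ reported in Example 4. The sketch below evaluates the map $G$ of Definition 1 on each row and picks the row of smallest norm. It takes the printed entries of $W$ at face value and does not re-check the constraints; the helper name `G` mirrors the text, while `best` and `values` are illustrative names.

```python
import math

# Rows w_1, w_2, w_3 of W as printed in Example 4.
W = [
    (-1.39, -0.512, -0.627),
    (0.667, 0.0, -0.667),
    (0.667, 0.0, 0.0),
]

def G(w_row, p):
    """Map of Definition 1: the norm of w_i repeated p times."""
    norm = math.sqrt(sum(c * c for c in w_row))
    return tuple(norm for _ in range(p))

p = len(W)
values = [G(row, p) for row in W]

# All components of G(w_i) are equal, so comparing the first one is enough.
best = min(range(p), key=lambda i: values[i][0])
w_min = W[best]
```

Here `w_min` is the third row, $(0.667, 0, 0)$, and `values[best]` is $(0.667, 0.667, 0.667)$ up to rounding.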
Fig. 12 Data for situation 1 in Example 5

If we are dealing with data that is linearly separable, then in the process of
GSVM the map $\Phi$ acts as the identity operator. The next example illustrates several situations
of this kind.


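Example 5 Let us consider three categories of data:

Situation 1 Suppose that we have data $(2, 0)$, $(0, 2)$, $(2, 1)$ as positive class and
data $(-1, 0)$, $(0, -1)$, $(-1, -1/2)$ as negative class, as shown in Fig. 12.

For the positive points $(2, 0)$, $(0, 2)$, $(2, 1)$, we have

$$ \begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix} \begin{pmatrix} 2 \\ 0 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \ge \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad
\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix} \begin{pmatrix} 0 \\ 2 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \ge \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad
\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix} \begin{pmatrix} 2 \\ 1 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \ge \begin{pmatrix} 1 \\ 1 \end{pmatrix}, $$

which implies

$$ \begin{pmatrix} 2 w_{11} \\ 2 w_{21} \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \ge \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad
\begin{pmatrix} 2 w_{12} \\ 2 w_{22} \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \ge \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad
\begin{pmatrix} 2 w_{11} + w_{12} \\ 2 w_{21} + w_{22} \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \ge \begin{pmatrix} 1 \\ 1 \end{pmatrix}. $$

Again, for the negative points $(-1, 0)$, $(0, -1)$, $(-1, -1/2)$, we have

$$ \begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix} \begin{pmatrix} -1 \\ 0 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \le \begin{pmatrix} -1 \\ -1 \end{pmatrix}, \quad
\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix} \begin{pmatrix} 0 \\ -1 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \le \begin{pmatrix} -1 \\ -1 \end{pmatrix}, \quad
\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix} \begin{pmatrix} -1 \\ -1/2 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \le \begin{pmatrix} -1 \\ -1 \end{pmatrix}, $$

which gives

$$ \begin{pmatrix} -w_{11} \\ -w_{21} \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \le \begin{pmatrix} -1 \\ -1 \end{pmatrix}, \quad
\begin{pmatrix} -w_{12} \\ -w_{22} \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \le \begin{pmatrix} -1 \\ -1 \end{pmatrix}, \quad
\begin{pmatrix} -w_{11} - \tfrac{1}{2} w_{12} \\ -w_{21} - \tfrac{1}{2} w_{22} \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \le \begin{pmatrix} -1 \\ -1 \end{pmatrix}. $$

Thus we get

$$ \min_{i \in \{1, 2\}} G(w_i) = \left( \frac{2\sqrt{2}}{3}, \frac{2\sqrt{2}}{3} \right). $$

Hence we get $w = \left( \frac{2}{3}, \frac{2}{3} \right)$ that minimizes $G(w_i)$ for $i = 1, 2$.

Fig. 13 The data separation for situation 2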

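Situation 2 We consider the data $(1, 0)$, $(0, 1)$, $(1/2, 1)$ as positive class and the data
$(-4, 0)$, $(0, -4)$, $(-2, -4)$ as negative class, as shown in Fig. 13.

Now, for the positive points $(1, 0)$, $(0, 1)$, $(1/2, 1)$ of Situation 2, we have

$$ \begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \ge \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad
\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \ge \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad
\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix} \begin{pmatrix} 1/2 \\ 1 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \ge \begin{pmatrix} 1 \\ 1 \end{pmatrix}, $$

which gives

$$ \begin{pmatrix} w_{11} \\ w_{21} \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \ge \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad
\begin{pmatrix} w_{12} \\ w_{22} \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \ge \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad
\begin{pmatrix} \tfrac{1}{2} w_{11} + w_{12} \\ \tfrac{1}{2} w_{21} + w_{22} \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \ge \begin{pmatrix} 1 \\ 1 \end{pmatrix}. $$

For the negative points in this case, we have

$$ \begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix} \begin{pmatrix} -4 \\ 0 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \le \begin{pmatrix} -1 \\ -1 \end{pmatrix}, \quad
\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix} \begin{pmatrix} 0 \\ -4 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \le \begin{pmatrix} -1 \\ -1 \end{pmatrix}, \quad
\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix} \begin{pmatrix} -2 \\ -4 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \le \begin{pmatrix} -1 \\ -1 \end{pmatrix}, $$

which gives

$$ \begin{pmatrix} -4 w_{11} \\ -4 w_{21} \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \le \begin{pmatrix} -1 \\ -1 \end{pmatrix}, \quad
\begin{pmatrix} -4 w_{12} \\ -4 w_{22} \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \le \begin{pmatrix} -1 \\ -1 \end{pmatrix}, \quad
\begin{pmatrix} -2 w_{11} - 4 w_{12} \\ -2 w_{21} - 4 w_{22} \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \le \begin{pmatrix} -1 \\ -1 \end{pmatrix}. $$

Thus, we obtain that

$$ W = \begin{pmatrix} \frac{2}{5} & \frac{2}{5} \\ \frac{2}{5} & \frac{2}{5} \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} \frac{3}{5} \\ \frac{3}{5} \end{pmatrix}. $$

Thus we get

$$ \min_{i \in \{1, 2\}} G(w_i) = \left( \frac{2\sqrt{2}}{5}, \frac{2\sqrt{2}}{5} \right). $$

Hence we get $w = \left( \frac{2}{5}, \frac{2}{5} \right)$ that minimizes $G(w_i)$ for $i = 1, 2$.

Fig. 14 The data separation for situation 3

In Situation 3, we combine these two groups of data. Now, we have the data
$(2, 0)$, $(0, 2)$, $(2, 1)$, $(1, 0)$, $(0, 1)$, $(1/2, 1)$ as positive class and $(-1, 0)$, $(0, -1)$,
$(-1, -1/2)$, $(-4, 0)$, $(0, -4)$, $(-2, -4)$ as negative class (see Fig. 14).

For the positive points of the combination, we have

$$ \begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix} x + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \ge \begin{pmatrix} 1 \\ 1 \end{pmatrix} \quad \text{for each } x \in \left\{ \begin{pmatrix} 2 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 2 \end{pmatrix}, \begin{pmatrix} 2 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 1/2 \\ 1 \end{pmatrix} \right\}, $$

which gives the positive-class inequalities of Situations 1 and 2 together. For the negative points in this case, we have

$$ \begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix} x + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \le \begin{pmatrix} -1 \\ -1 \end{pmatrix} \quad \text{for each } x \in \left\{ \begin{pmatrix} -1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ -1 \end{pmatrix}, \begin{pmatrix} -1 \\ -1/2 \end{pmatrix}, \begin{pmatrix} -4 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ -4 \end{pmatrix}, \begin{pmatrix} -2 \\ -4 \end{pmatrix} \right\}, $$

which gives the negative-class inequalities of Situations 1 and 2 together.

From this, we obtain that

$$ W = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. $$

Thus we get

$$ \min_{i \in \{1, 2\}} G(w_i) = \left( \sqrt{2}, \sqrt{2} \right). $$

Hence we get $w = (1, 1)$ that minimizes $G(w_i)$ for $i = 1, 2$.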
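
The three situations of Example 5 can be checked numerically with one row $w$ and one bias $b$ at a time, since the rows of $W$ coincide. The sketch below only tests that each candidate satisfies the margin constraints and reports its norm; it does not prove minimality. The values for Situations 2 and 3 are read from the text; the values used for Situation 1, $w = (2/3, 2/3)$ and $b = -1/3$, are an assumption recomputed from the active constraints, because the printed values are not legible.

```python
import math

def check(w, b, positive, negative, tol=1e-9):
    """Return (all margin constraints hold, ||w||) for a linear classifier."""
    score = lambda x: sum(wi * xi for wi, xi in zip(w, x)) + b
    margins = [score(x) for x in positive] + [-score(x) for x in negative]
    return all(m >= 1 - tol for m in margins), math.hypot(*w)

pos1, neg1 = [(2, 0), (0, 2), (2, 1)], [(-1, 0), (0, -1), (-1, -0.5)]
pos2, neg2 = [(1, 0), (0, 1), (0.5, 1)], [(-4, 0), (0, -4), (-2, -4)]

# Situation 1: assumed values (recomputed, not legible in the source).
ok1, norm1 = check((2 / 3, 2 / 3), -1 / 3, pos1, neg1)
# Situation 2: W rows (2/5, 2/5), B entries 3/5.
ok2, norm2 = check((0.4, 0.4), 0.6, pos2, neg2)
# Situation 3: combined data, W rows (1, 1), B entries 0.
ok3, norm3 = check((1.0, 1.0), 0.0, pos1 + pos2, neg1 + neg2)
```

All three candidates are feasible, with norms $2\sqrt{2}/3$, $2\sqrt{2}/5$ and $\sqrt{2}$ respectively. Note that the solution for the combined data is not the solution of either subproblem, because the closest positive and negative points change.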
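The problem of GSVM defined in (10) is equivalent to

$$ \text{find } w_i \in W : \quad \langle G'(w_i), v - w_i \rangle \ge 0 \quad \text{for all } v \in \mathbb{R}^p \text{ with } \eta \ge 0. \tag{11} $$

Hence the problem of GSVM becomes a problem of generalized variational
inequality.

Note that if we take $G'(w_i) = \frac{w_i}{\|w_i\|}$, then from (11), we obtain

$$ \text{find } w_i \in W : \quad \langle w_i, v - w_i \rangle \ge 0 \quad \text{for all } v \in \mathbb{R}^p \text{ with } \eta \ge 0, \tag{12} $$

or

$$ \text{find } w_i \in W : \quad \langle w_i, v \rangle \ge \|w_i\|^2 \quad \text{for all } v \in \mathbb{R}^p \text{ with } \eta \ge 0. \tag{13} $$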
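
The passage from (11) to (12) and (13) is a one-line expansion of the inner product. A short worked version, assuming $w_i \ne 0$ so that $G'(w_i) = w_i / \|w_i\|$ is defined, and using only bilinearity of $\langle \cdot, \cdot \rangle$ and $\|w_i\| > 0$:

```latex
\begin{aligned}
\Big\langle \frac{w_i}{\|w_i\|},\, v - w_i \Big\rangle \ge 0
  &\iff \langle w_i,\, v - w_i \rangle \ge 0
  && \text{(multiply by } \|w_i\| > 0\text{)} \\
  &\iff \langle w_i, v \rangle - \langle w_i, w_i \rangle \ge 0
  && \text{(bilinearity)} \\
  &\iff \langle w_i, v \rangle \ge \|w_i\|^2 .
\end{aligned}
```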


We study the sufficient conditions for the existence of solutions for GSVM problems.


Proposition 1 Let $G : \mathbb{R}^p \to \mathbb{R}^p_+$ be a differentiable operator. An element $w^* \in \mathbb{R}^p$
minimizes $G$ if and only if $G'(w^*) = 0$, that is, $w^* \in \mathbb{R}^p$ solves GSVM if and only if
$G'(w^*) = 0$.

Proof Let $G'(w^*) = 0$. Then for all $v \in \mathbb{R}^p$ with $\eta = y_k \left( C K(x_j, x) + B \right) - 1 \ge 0$,

$$ \langle G'(w^*), v - w^* \rangle = \langle 0, v - w^* \rangle = 0, $$

and consequently, the inequality

$$ \langle G'(w^*), v - w^* \rangle \ge 0 $$

holds for all $v \in \mathbb{R}^p$. Hence $w^* \in \mathbb{R}^p$ solves the problem of GSVM.

Conversely, assume that $w^* \in \mathbb{R}^p$ satisfies

$$ \langle G'(w^*), v - w^* \rangle \ge 0 \quad \forall\, v \in \mathbb{R}^p \text{ such that } \eta \ge 0. $$

Taking $v = w^* - G'(w^*)$ in the above inequality implies that

$$ \langle G'(w^*), -G'(w^*) \rangle \ge 0, $$

which further implies

$$ -\|G'(w^*)\|^2 \ge 0, $$

and we get $G'(w^*) = 0$.


Remark 1 Note that if $G'(w^*) = 0$ at some $w^* \in \mathbb{R}^p$, then we obtain $\frac{w^*}{\|w^*\|} = 0$, which
implies $w^* = 0$. Thus it follows from Proposition 1 that if $G'(w^*) = 0$ at some
$w^* \in \mathbb{R}^p$, then $w^* = 0$ solves the GSVM problem.


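Remark 2 If $w^* = 0$, then from (8), we obtain

$$ \sum_j \alpha_j^{(i)} \Phi(x_j) = 0, $$

which implies

$$ \sum_j \alpha_j^{(i)} \Phi(x_j) \Phi(x) = 0, $$

that is,

$$ \sum_j \alpha_j^{(i)} K(x_j, x) = 0. \tag{14} $$

Since $\alpha_j^{(i)} > 0$ for all $j$, we have

$$ K(x_j, x) = 0. $$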


Definition 2 Let $K$ be a closed and convex subset of $\mathbb{R}^n$. Then, for every point
$x \in \mathbb{R}^n$, there exists a unique nearest point in $K$, denoted by $P_K(x)$, such that
$\|x - P_K(x)\| \le \|x - y\|$ for all $y \in K$; note also that $P_K(x) = x$ if $x \in K$. $P_K$
is called the metric projection of $\mathbb{R}^n$ onto $K$. It is well known that $P_K : \mathbb{R}^n \to K$ is
characterized by the properties:

(i) $P_K(x) = z$ for $x \in \mathbb{R}^n$ if and only if $\langle z - x, y - z \rangle \ge 0$ for all $y \in K$;
(ii) for every $x, y \in \mathbb{R}^n$, $\|P_K(x) - P_K(y)\|^2 \le \langle x - y, P_K(x) - P_K(y) \rangle$;
(iii) $\|P_K(x) - P_K(y)\| \le \|x - y\|$ for every $x, y \in \mathbb{R}^n$, that is, $P_K$ is a nonexpansive
map.
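
Properties (ii) and (iii) can be spot-checked numerically for a concrete closed convex set. The sketch below takes $K$ to be the nonnegative orthant $\mathbb{R}^3_+$, for which the metric projection is the componentwise maximum with zero; this choice of $K$, the sample size and the tolerance are assumptions made for illustration only, and a random test is not a proof.

```python
import math
import random

def project_nonneg(x):
    """Metric projection onto the nonnegative orthant: clip negatives to 0."""
    return [max(c, 0.0) for c in x]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

random.seed(0)
holds_ii = True    # ||P(x) - P(y)||^2 <= <x - y, P(x) - P(y)>
holds_iii = True   # ||P(x) - P(y)|| <= ||x - y||

for _ in range(1000):
    x = [random.uniform(-5, 5) for _ in range(3)]
    y = [random.uniform(-5, 5) for _ in range(3)]
    px, py = project_nonneg(x), project_nonneg(y)
    d_in = [a - b for a, b in zip(x, y)]       # x - y
    d_out = [a - b for a, b in zip(px, py)]    # P(x) - P(y)
    holds_ii = holds_ii and dot(d_out, d_out) <= dot(d_in, d_out) + 1e-12
    holds_iii = holds_iii and norm(d_out) <= norm(d_in) + 1e-12
```

Both flags stay true on all sampled pairs, as expected for a projection onto a closed convex set.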


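Proposition 2 Let $G : \mathbb{R}^p \to \mathbb{R}^p_+$ be a differentiable operator. An element $w^* \in \mathbb{R}^p$
minimizes the mapping $G$ defined in (11) if and only if $w^*$ is a fixed point of the map

$$ P_{\mathbb{R}^p_+} (I - \rho G') : \mathbb{R}^p \to \mathbb{R}^p_+, \quad \text{for any } \rho > 0, $$

that is,

$$ w^* = P_{\mathbb{R}^p_+} (I - \rho G')(w^*) = P_{\mathbb{R}^p_+} \left( w^* - \rho G'(w^*) \right), $$

where $P_{\mathbb{R}^p_+}$ is the projection map from $\mathbb{R}^p$ to $\mathbb{R}^p_+$ and $\eta = y_k \left( C K(x_j, x) + B \right) - 1 \ge 0$.

Proof Suppose $w^* \in \mathbb{R}^p_+$ is a solution of GSVM. Then for $\eta = y_k \left( C K(x_j, x) + B \right) - 1 \ge 0$, we have

$$ \langle G'(w^*), w - w^* \rangle \ge 0 \quad \text{for all } w \in \mathbb{R}^p. $$

Multiplying by $\rho > 0$ and adding $\langle w^*, w - w^* \rangle$ on both sides, we get

$$ \langle w^*, w - w^* \rangle + \langle \rho G'(w^*), w - w^* \rangle \ge \langle w^*, w - w^* \rangle \quad \text{for all } w \in \mathbb{R}^p, $$

which further implies that

$$ \langle w^* - \left( w^* - \rho G'(w^*) \right), w - w^* \rangle \ge 0 \quad \text{for all } w \in \mathbb{R}^p, $$

which, by property (i) of the metric projection, is possible only if $w^* = P_{\mathbb{R}^p_+} \left( w^* - \rho G'(w^*) \right)$, that is, $w^*$ is a fixed point
of $P_{\mathbb{R}^p_+} (I - \rho G')$.

Conversely, let $w^* = P_{\mathbb{R}^p_+} \left( w^* - \rho G'(w^*) \right)$ with $\eta = y_k \left( C K(x_j, x) + B \right) - 1 \ge 0$.
Then we have

$$ \langle w^* - \left( w^* - \rho G'(w^*) \right), w - w^* \rangle \ge 0 \quad \text{for all } w \in \mathbb{R}^p, $$

which, since $\rho > 0$, implies

$$ \langle G'(w^*), w - w^* \rangle \ge 0 \quad \text{for all } w \in \mathbb{R}^p, $$

and so $w^* \in \mathbb{R}^p_+$ is the solution of GSVM.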
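
The fixed-point characterization suggests a simple iteration $w \leftarrow P_{\mathbb{R}^p_+}(w - \rho G'(w))$. The sketch below runs it on a stand-in objective, not on the GSVM map itself: the smooth function $g(w) = \frac{1}{2}\|w - c\|^2$ with gradient $w - c$ is a hypothetical choice (the norm map $G$ is not differentiable at $0$), and the vector `c`, the step `rho` and the iteration count are illustrative assumptions. For this stand-in, the fixed point is the projection of $c$ onto the nonnegative orthant.

```python
def project_nonneg(x):
    """Projection onto the nonnegative orthant."""
    return [max(v, 0.0) for v in x]

def grad(w, c):
    """Gradient of the stand-in objective g(w) = 0.5 * ||w - c||^2."""
    return [wi - ci for wi, ci in zip(w, c)]

def projected_fixed_point(c, rho=0.5, iters=200):
    """Iterate w <- P(w - rho * grad(w)) starting from the origin."""
    w = [0.0] * len(c)
    for _ in range(iters):
        g = grad(w, c)
        w = project_nonneg([wi - rho * gi for wi, gi in zip(w, g)])
    return w

c = [1.5, -2.0, 0.25]          # hypothetical data for the stand-in objective
w_star = projected_fixed_point(c)

# Fixed-point check: applying the map once more leaves w_star unchanged.
rho = 0.5
again = project_nonneg([wi - rho * gi for wi, gi in zip(w_star, grad(w_star, c))])
residual = max(abs(a - b) for a, b in zip(again, w_star))
```

The iterate converges to $(1.5, 0, 0.25)$, the componentwise positive part of `c`, and the residual of the fixed-point equation is numerically zero.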


3 Conclusion 


Linear and nonlinear data classification using support vector machines and
generalized support vector machines has been studied. We also studied sufficient
conditions for the existence of solutions of the generalized support vector machine problem. Some
examples were given to support these results.

Common Fixed Points of Weakly Commuting
Multivalued Mappings on a Domain of Sets
Endowed with Directed Graph


Talat Nazir and Sergei Silvestrov 


Abstract In this paper, the existence of coincidence points and common fixed points
for multivalued mappings satisfying certain graphic $\psi$-contraction conditions
with set-valued domain endowed with a graph, without appealing to continuity,
is established. Some examples are presented to support the results proved herein.
Our results unify, generalize and extend various results in the existing literature.


Keywords Common fixed point · Multivalued mapping · Graphic contraction ·
Directed graph


1 Introduction and Preliminaries 


Order oriented fixed point theory is studied in an environment created by a class of
partially ordered sets with appropriate mappings satisfying certain order conditions
like monotonicity, expansivity or order continuity. Existence of fixed points in
partially ordered metric spaces has been studied by Ran and Reurings [26]. Recently,
many researchers have obtained fixed point results for single and multivalued
mappings defined on partially ordered metric spaces (see, e.g., [6, 8, 18, 25]). Jachymski
and Jóźwik [20] introduced a new approach in metric fixed point theory by replacing
the order structure with a graph structure on a metric space. In this way, the
results proved in ordered metric spaces are generalized (see also [19] and the references
therein); in fact, in 2010, Gwóźdź-Łukawska and Jachymski [17] developed
the Hutchinson-Barnsley theory for finite families of mappings on a metric space
endowed with a directed graph. Abbas and Nazir [2] obtained some fixed point
results for power graph contraction pairs endowed with a graph. Bojor [13] proved
fixed point theorems for Reich type contractions on metric spaces with a graph.
For more results in this direction, we refer to [4, 5, 12, 14, 15, 24] and the references
mentioned therein.

Beg and Butt [9] proved the existence of fixed points of multivalued mapping in 
metric spaces endowed with a graph G. Recently, Abbas et al. [3] obtained fixed 
points of multivalued mappings satisfying certain graphic contraction conditions 
with set-valued domain endowed with a graph. Nicolae et al. [24] established some 
fixed points of multivalued generalized contractions in metric spaces endowed with 
a graph. 

The aim of this paper is to prove some coincidence point and common fixed point
results for multivalued graphic $\psi$-contractive mappings defined on the family of
closed and bounded subsets of a metric space endowed with a graph $G$. These results
extend and strengthen various comparable results in the existing literature [3, 9, 12,
19, 20, 23].

The letters $\mathbb{R}$, $\mathbb{R}^+$ and $\mathbb{N}$ denote the set of all real numbers, the set of
all positive real numbers and the set of all natural numbers, respectively.

Consistent with Jachymski [19], let $(X, d)$ be a metric space and let $\Delta$ denote the
diagonal of $X \times X$. Let $G$ be a directed graph such that the set $V(G)$ of its vertices
coincides with $X$ and $E(G)$ is the set of edges of the graph, which contains all loops,
that is, $\Delta \subseteq E(G)$. Also assume that the graph $G$ has no parallel edges and, thus,
one can identify $G$ with the pair $(V(G), E(G))$.


Definition 1 [19] An operator $f : X \to X$ is called a Banach $G$-contraction or
simply a $G$-contraction if

(a) $f$ preserves edges of $G$: for each $x, y \in X$ with $(x, y) \in E(G)$, we have
$(f(x), f(y)) \in E(G)$;

(b) $f$ decreases weights of edges of $G$: there exists $\alpha \in (0, 1)$ such that for all
$x, y \in X$ with $(x, y) \in E(G)$, we have $d(f(x), f(y)) \le \alpha\, d(x, y)$.

If $x$ and $y$ are vertices of $G$, then a path in $G$ from $x$ to $y$ of length $k \in \mathbb{N}$ is a
finite sequence $\{x_n\}$ $(n \in \{0, 1, 2, \dots, k\})$ of vertices such that $x_0 = x$, $x_k = y$ and
$(x_{i-1}, x_i) \in E(G)$ for $i \in \{1, 2, \dots, k\}$.

Notice that a graph $G$ is connected if there is a directed path between any two
vertices and it is weakly connected if $\tilde{G}$ is connected, where $\tilde{G}$ denotes the undirected
graph obtained from $G$ by ignoring the direction of edges. Denote by $G^{-1}$ the graph
obtained from $G$ by reversing the direction of edges. Thus,

$$ E(G^{-1}) = \{ (x, y) \in X \times X : (y, x) \in E(G) \}. $$

It is more convenient to treat $\tilde{G}$ as a directed graph for which the set of its edges
is symmetric. Under this convention, we have that

$$ E(\tilde{G}) = E(G) \cup E(G^{-1}). $$

In $V(G)$, we define the relation $R$ in the following way:
for $x, y \in V(G)$, we have $x R y$ if and only if there is a path in $G$ from $x$ to $y$. If
$G$ is such that $E(G)$ is symmetric, then for $x \in V(G)$, the equivalence class $[x]_{\tilde{G}}$ in
$V(G)$ defined by the relation $R$ is $V(G_x)$.

Recall that if $f : X \to X$ is an operator, then by $F_f$ we denote the set of all fixed
points of $f$. Set

$$ X_f := \{ x \in X : (x, f(x)) \in E(G) \}. $$

Jachymski [20] used the following property:

(P): for any sequence $\{x_n\}$ in $X$, if $x_n \to x$ as $n \to \infty$ and $(x_n, x_{n+1}) \in E(G)$,
then $(x_n, x) \in E(G)$.


Theorem 1 [20] Let $(X, d)$ be a complete metric space and $G$ a directed graph such
that $V(G) = X$ and $f : X \to X$ a $G$-contraction. Suppose that $E(G)$ and the triplet
$(X, d, G)$ have property (P). Then the following statements hold:

(i) $F_f \ne \emptyset$ if and only if $X_f \ne \emptyset$;
(ii) if $X_f \ne \emptyset$ and $G$ is weakly connected, then $f$ is a Picard operator, i.e., $F_f = \{x^*\}$
and the sequence $\{f^n(x)\} \to x^*$ as $n \to \infty$, for all $x \in X$;
(iii) for any $x \in X_f$, $f|_{[x]_{\tilde{G}}}$ is a Picard operator;
(iv) if $X_f \subseteq E(G)$, then $f$ is a weakly Picard operator, i.e., $F_f \ne \emptyset$ and, for each
$x \in X$, we have the sequence $\{f^n(x)\} \to x^* \in F_f$ as $n \to \infty$.


For a detailed discussion on Picard operators, we refer to Berinde [10, 11].

Let $(X, d)$ be a metric space and $CB(X)$ the class of all nonempty closed and
bounded subsets of $X$. For $A, B \in CB(X)$, let

$$ H(A, B) = \max\left\{ \sup_{b \in B} d(b, A), \; \sup_{a \in A} d(a, B) \right\}, $$

where $d(x, B) = \inf\{ d(x, b) : b \in B \}$ is the distance of a point $x$ to the set $B$. The
mapping $H$ is said to be the Pompeiu-Hausdorff metric induced by $d$.
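
For finite sets the suprema and infima in the definition of $H$ become maxima and minima, so the Pompeiu-Hausdorff distance can be computed directly. The following sketch assumes finite nonempty sets and a metric passed as a function; the names `point_to_set` and `hausdorff` and the sample sets are illustrative.

```python
def point_to_set(x, B, d):
    """d(x, B) = inf{ d(x, b) : b in B } for a finite nonempty set B."""
    return min(d(x, b) for b in B)

def hausdorff(A, B, d):
    """H(A, B) = max{ sup_{b in B} d(b, A), sup_{a in A} d(a, B) }."""
    return max(
        max(point_to_set(b, A, d) for b in B),
        max(point_to_set(a, B, d) for a in A),
    )

# Usual metric on the real line.
d = lambda x, y: abs(x - y)

A = {0, 1}
B = {0, 3}
h_ab = hausdorff(A, B, d)   # sup_b d(b, A) = 2, sup_a d(a, B) = 1
h_ba = hausdorff(B, A, d)
h_aa = hausdorff(A, A, d)
```

Here `h_ab` equals 2: the point 3 of `B` is at distance 2 from `A`, while every point of `A` is within distance 1 of `B`. The value is symmetric in its arguments and vanishes when the two sets coincide.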

Throughout this paper, we assume that a directed graph $G$ has no parallel edges
and $G$ is a weighted graph in the sense that each vertex $x$ is assigned the weight
$d(x, x) = 0$ and each edge $(x, y)$ is assigned the weight $d(x, y)$. Since $d$ is a metric
on $X$, the weight assigned to an edge from vertex $x$ to vertex $y$ need not be zero and,
whenever a zero weight is assigned to some edge $(x, y)$, it reduces to a loop $(x, x)$
having weight $0$. Further, in the Pompeiu-Hausdorff metric induced by the metric $d$, the
Pompeiu-Hausdorff weight assigned to each $U, V \in CB(X)$ need not be zero (that
is, $H(U, V) \ne 0$) and, whenever a zero Pompeiu-Hausdorff weight is assigned to
some $U, V \in CB(X)$, it reduces to $U = V$.


Definition 2 [3] Let $A$ and $B$ be two nonempty subsets of $X$. Then by:

(a) 'there is an edge between $A$ and $B$', we mean there is an edge between some
$a \in A$ and $b \in B$, which we denote by $(A, B) \subset E(G)$.
(b) 'there is a path between $A$ and $B$', we mean that there is a path between some
$a \in A$ and $b \in B$.

In $CB(X)$, we define a relation $R$ in the following way: for $A, B \in CB(X)$, we
have $A R B$ if and only if there is a path between $A$ and $B$. We say that the relation
$R$ on $CB(X)$ is transitive if, whenever there is a path between $A$ and $B$ and there is a path
between $B$ and $C$, there is a path between $A$ and $C$.

Consider the mapping $T : CB(X) \to CB(X)$ instead of a mapping $T$ from $X$ to
$X$ or from $X$ to $CB(X)$.

For mappings $T : CB(X) \to CB(X)$, the set $X_T$ is defined as

$$ X_T := \{ U \in CB(X) : (U, T(U)) \subset E(G) \}. $$


Recently, Abbas et al. [3] gave the following definition. 


Definition 3 Let $T : CB(X) \to CB(X)$ be a multivalued mapping. The mapping $T$
is said to be a graph $\phi$-contraction if the following conditions hold:

(i) There is an edge between $A$ and $B$ implies there is an edge between $T(A)$ and
$T(B)$ for all $A, B \in CB(X)$.
(ii) There is a path between $A$ and $B$ implies there is a path between $T(A)$ and
$T(B)$ for all $A, B \in CB(X)$.
(iii) There exists an upper semi-continuous and nondecreasing function $\phi : \mathbb{R}^+ \to \mathbb{R}^+$
with $\phi(t) < t$ for each $t > 0$ such that there is an edge between $A$ and $B$
implies that

$$ H(T(A), T(B)) \le \phi(H(A, B)) \quad \text{for all } A, B \text{ in } CB(X). $$


Definition 4 Let $S, T : CB(X) \to CB(X)$ be two multivalued mappings. The set
$U \in CB(X)$ is said to be a coincidence point of $S$ and $T$ if $S(U) = T(U)$. Also, a
set $A \in CB(X)$ is said to be a fixed point of $S$ if $S(A) = A$. The set of all coincidence
points of $S$ and $T$ is denoted by $CP(S, T)$ and the set of all fixed points of $S$ is denoted
by $Fix(S)$.


Definition 5 Two maps $S, T : CB(X) \to CB(X)$ are said to be weakly compatible
if they commute at their coincidence points.


For more details on weakly compatible maps, we refer the reader to
[121,22].

A subset $\Gamma$ of $CB(X)$ is said to be complete if for any sets $X, Y \in \Gamma$, there is an
edge between $X$ and $Y$.

Abbas et al. [3] used the property (P*) stated as follows: a graph $G$ is said to have
property

(P*): if for any sequence $\{X_n\}$ in $CB(X)$ with $X_n \to X$ as $n \to \infty$, the existence of an
edge between $X_{n+1}$ and $X_n$ for $n \in \mathbb{N}$ implies that there is a subsequence $\{X_{n_k}\}$ of
$\{X_n\}$ with an edge between $X$ and $X_{n_k}$ for $k \in \mathbb{N}$.

Theorem 2 [3] Let $(X, d)$ be a complete metric space endowed with a directed graph
$G$ such that $V(G) = X$ and $E(G) \supseteq \Delta$. If $T : CB(X) \to CB(X)$ is a graph
$\phi$-contraction mapping such that the relation $R$ on $CB(X)$ is transitive, then the following
statements hold:

(a) If $Fix(T)$ is complete, then the Pompeiu-Hausdorff weight assigned to the
$U, V \in Fix(T)$ is $0$.

(b) $X_T \ne \emptyset$ provided that $Fix(T) \ne \emptyset$.

(c) If $X_T \ne \emptyset$ and the weakly connected graph $G$ satisfies the property (P*), then
$T$ has a fixed point.

(d) $Fix(T)$ is complete if and only if $Fix(T)$ is a singleton.


We denote by $\Psi$ the set of all functions $\psi : \mathbb{R}^+ \to \mathbb{R}^+$, where $\psi$ is a nondecreasing
function with $\sum_{n=1}^{\infty} \psi^n(t)$ convergent. It is easy to show that if $\psi \in \Psi$, then $\psi(t) < t$
for any $t > 0$. We give the following definition.
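
A concrete member of the class $\Psi$ helps to fix ideas. The sketch below uses the hypothetical choice $\psi(t) = t/2$, which is nondecreasing, satisfies $\psi(t) < t$ for $t > 0$, and has $\sum_{n \ge 1} \psi^n(t) = t$ by the geometric series; the function names and the number of terms are illustrative.

```python
def psi(t):
    """Hypothetical comparison function psi(t) = t / 2."""
    return t / 2

def partial_sum(f, t, terms):
    """Sum of the iterates f(t), f(f(t)), ..., up to the given number of terms."""
    total, value = 0.0, t
    for _ in range(terms):
        value = f(value)   # value is now f^n(t)
        total += value
    return total

t0 = 3.0
series_60 = partial_sum(psi, t0, 60)   # close to the limit t0
strictly_below = all(psi(t) < t for t in (0.1, 1.0, 10.0))
```

The partial sums approach $t_0 = 3$, so the series of iterates converges, and $\psi(t) < t$ holds at each sampled $t > 0$.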


Definition 6 Let $(X, d)$ be a metric space endowed with a directed graph $G$ such
that $V(G) = X$, $E(G) \supseteq \Delta$ and for every $U$ in $CB(X)$, $(S(U), U) \subset E(G)$ and
$(U, T(U)) \subset E(G)$. Let $S, T : CB(X) \to CB(X)$ be two multivalued mappings.
The pair $(S, T)$ of maps is said to be

(I) a graph $\psi_1$-contraction pair if there exists a $\psi \in \Psi$ such that, whenever there is an edge between
$A$ and $B$,

$$ H(S(A), S(B)) \le \psi(M_1(A, B)) $$

holds, where

$$ M_1(A, B) = \max\left\{ H(T(A), T(B)),\; H(S(A), T(A)),\; H(S(B), T(B)),\; \frac{H(S(A), T(B)) + H(S(B), T(A))}{2} \right\}; $$

(II) a graph $\psi_2$-contraction pair if there exists a $\psi \in \Psi$ such that, whenever there is an edge between
$A$ and $B$,

$$ H(S(A), S(B)) \le \psi(M_2(A, B)) $$

holds, where

$$ M_2(A, B) = \alpha H(T(A), T(B)) + \beta H(S(A), T(A)) + \gamma H(S(B), T(B)) + \delta_1 H(S(A), T(B)) + \delta_2 H(S(B), T(A)), $$

and $\alpha, \beta, \gamma, \delta_1, \delta_2 \ge 0$, $\delta_1 \le \delta_2$ with $\alpha + \beta + \gamma + \delta_1 + \delta_2 \le 1$.


It is obvious that if a pair $(S, T)$ of multivalued mappings on $CB(X)$ is a graph
$\psi_1$-contraction or graph $\psi_2$-contraction for the graph $G$, then the pair $(S, T)$ is also a graph
$\psi_1$-contraction or graph $\psi_2$-contraction, respectively, for the graphs $G^{-1}$, $\tilde{G}$ and $G_0$;
here the graph $G_0$ is defined by $E(G_0) = X \times X$.


Definition 7 A metric space $(X, d)$ is called an $\varepsilon$-chainable metric space for some
$\varepsilon > 0$ if for given $x, y \in X$, there is $n \in \mathbb{N}$ and a sequence $\{x_n\}$ such that

$$ x_0 = x, \quad x_n = y \quad \text{and} \quad d(x_{i-1}, x_i) < \varepsilon \quad \text{for } i = 1, \dots, n. $$

For fixed point results of mappings defined on $\varepsilon$-chainable metric spaces, we refer
to [9] and the references mentioned therein. We also need the following lemma of
Nadler [23] (see also [7]).


Lemma 1 Let $(X, d)$ be a metric space. If $U, V \in CB(X)$ with $H(U, V) < \varepsilon$, then
for each $u \in U$ there exists an element $v \in V$ such that $d(u, v) < \varepsilon$.


2 Common Fixed Points 


In this section, we obtain coincidence point and common fixed point results for
multivalued selfmaps on $CB(X)$ satisfying graph $\psi$-contraction conditions endowed
with a directed graph.


Theorem 3 Let $(X, d)$ be a metric space endowed with a directed graph $G$ such
that $V(G) = X$, $E(G) \supseteq \Delta$ and $S, T : CB(X) \to CB(X)$ a graph $\psi_1$-contraction
pair such that the range of $T$ contains the range of $S$. Then the following statements
hold:

(i) $CP(S, T) \ne \emptyset$ provided that $G$ is weakly connected, satisfies the property
(P*) and $T(X)$ is a complete subspace of $CB(X)$.
(ii) If $CP(S, T)$ is complete, then the Pompeiu-Hausdorff weight assigned to
$S(U)$ and $S(V)$ is $0$ for all $U, V \in CP(S, T)$.
(iii) If $CP(S, T)$ is complete and $S$ and $T$ are weakly compatible, then $Fix(S) \cap Fix(T)$
is a singleton.
(iv) $Fix(S) \cap Fix(T)$ is complete if and only if $Fix(S) \cap Fix(T)$ is a singleton.


Proof To prove (i), let $A_0$ be an arbitrary element in $CB(X)$. Since the range of $T$ contains
the range of $S$, we can choose $A_1 \in CB(X)$ such that $S(A_0) = T(A_1)$. Continuing this
process, having chosen $A_n$ in $CB(X)$, we obtain an $A_{n+1}$ in $CB(X)$ such that $S(A_n) = T(A_{n+1})$
for $n \in \mathbb{N}$. The inclusions $(A_{n+1}, T(A_{n+1})) \subset E(G)$ and $(T(A_{n+1}), A_n) = (S(A_n), A_n) \subset E(G)$
imply that $(A_{n+1}, A_n) \subset E(G)$.

We may assume that $S(A_n) \ne S(A_{n+1})$ for all $n \in \mathbb{N}$. If not, then $S(A_{2k}) = S(A_{2k+1})$
for some $k$ implies $T(A_{2k+1}) = S(A_{2k+1})$, and thus $A_{2k+1} \in CP(S, T)$.
Now, since $(A_{n+1}, A_n) \subset E(G)$ for all $n \in \mathbb{N}$ and the pair $(S, T)$ forms a graph
$\psi_1$-contraction, we have

$$ H(T(A_{n+1}), T(A_{n+2})) = H(S(A_n), S(A_{n+1})) \le \psi(M_1(A_n, A_{n+1})), $$

where

$$ \begin{aligned}
M_1(A_n, A_{n+1}) &= \max\Big\{ H(T(A_n), T(A_{n+1})),\; H(S(A_n), T(A_n)),\; H(S(A_{n+1}), T(A_{n+1})), \\
&\qquad\qquad \frac{H(S(A_n), T(A_{n+1})) + H(S(A_{n+1}), T(A_n))}{2} \Big\} \\
&= \max\Big\{ H(T(A_n), T(A_{n+1})),\; H(T(A_{n+1}), T(A_n)),\; H(T(A_{n+2}), T(A_{n+1})), \\
&\qquad\qquad \frac{H(T(A_{n+1}), T(A_{n+1})) + H(T(A_{n+2}), T(A_n))}{2} \Big\} \\
&\le \max\Big\{ H(T(A_n), T(A_{n+1})),\; H(T(A_{n+1}), T(A_{n+2})), \\
&\qquad\qquad \frac{H(T(A_{n+2}), T(A_{n+1})) + H(T(A_{n+1}), T(A_n))}{2} \Big\} \\
&= \max\big\{ H(T(A_n), T(A_{n+1})),\; H(T(A_{n+1}), T(A_{n+2})) \big\}.
\end{aligned} $$

Thus, we have

$$ \begin{aligned}
H(T(A_{n+1}), T(A_{n+2})) &\le \psi\big( \max\{ H(T(A_n), T(A_{n+1})),\; H(T(A_{n+1}), T(A_{n+2})) \} \big) \\
&= \psi\big( H(T(A_n), T(A_{n+1})) \big)
\end{aligned} $$

for all $n \in \mathbb{N}$. Therefore, for $i = 1, 2, \dots, n$, we have

$$ \begin{aligned}
H(T(A_{i-1}), T(A_i)) &\le \psi(H(A_{i-1}, A_i)), \\
H(T(A_{i-2}), T(A_{i-1})) &\le \psi(H(A_{i-2}, A_{i-1})), \\
&\;\;\vdots \\
H(T(A_0), T(A_1)) &\le \psi(H(A_0, A_1)),
\end{aligned} $$

and so we obtain

$$ H(T(A_n), T(A_{n+1})) \le \psi^n(H(A_0, T(A_1))) $$

for all $n \in \mathbb{N}$. Now for $m, n \in \mathbb{N}$ with $m > n \ge 1$, we have

$$ \begin{aligned}
H(T(A_n), T(A_m)) &\le H(T(A_n), T(A_{n+1})) + H(T(A_{n+1}), T(A_{n+2})) + \cdots + H(T(A_{m-1}), T(A_m)) \\
&\le \psi^n(H(A_0, T(A_1))) + \psi^{n+1}(H(A_0, T(A_1))) + \cdots + \psi^{m-1}(H(A_0, T(A_1))).
\end{aligned} $$

By the convergence of the series $\sum_{i=1}^{\infty} \psi^i(H(A_0, T(A_1)))$, we get $H(T(A_n), T(A_m)) \to 0$
as $n, m \to \infty$. Therefore $\{T(A_n)\}$ is a Cauchy sequence in $T(X)$.
Since $(T(X), d)$ is complete in $CB(X)$, we have $T(A_n) \to V$ as $n \to \infty$ for some
$V \in CB(X)$. Also, we can find $U$ in $CB(X)$ such that $T(U) = V$.
We claim that $S(U) = T(U)$. If not, then since $(T(A_{n+1}), T(A_n)) \subset E(G)$,
by property (P*) there exists a subsequence $\{T(A_{n_k+1})\}$ of $\{T(A_{n+1})\}$ such
that $(T(U), T(A_{n_k+1})) \subset E(G)$ for every $k \in \mathbb{N}$. As $(U, T(U)) \subset E(G)$ and
$(T(A_{n_k+1}), A_{n_k}) = (S(A_{n_k}), A_{n_k}) \subset E(G)$, this implies that $(U, A_{n_k}) \subset E(G)$. Now

$$ H(S(U), T(A_{n_k+1})) = H(S(U), S(A_{n_k})) \le \psi(M_1(U, A_{n_k})), \tag{1} $$

where

$$ \begin{aligned}
M_1(U, A_{n_k}) &= \max\Big\{ H(T(U), T(A_{n_k})),\; H(S(U), T(U)),\; H(S(A_{n_k}), T(A_{n_k})), \\
&\qquad\qquad \frac{H(S(U), T(A_{n_k})) + H(S(A_{n_k}), T(U))}{2} \Big\} \\
&= \max\Big\{ H(T(U), T(A_{n_k})),\; H(S(U), T(U)),\; H(T(A_{n_k+1}), T(A_{n_k})), \\
&\qquad\qquad \frac{H(S(U), T(A_{n_k})) + H(T(A_{n_k+1}), T(U))}{2} \Big\}.
\end{aligned} $$

Now we consider the following cases:

If $M_1(U, A_{n_k}) = H(T(U), T(A_{n_k}))$, then on taking the limit as $k \to \infty$ in (1), we
have

$$ H(S(U), T(U)) \le \psi(H(T(U), T(U))), $$

a contradiction.

When $M_1(U, A_{n_k}) = H(S(U), T(U))$, then

$$ H(S(U), T(U)) \le \psi(H(S(U), T(U))) $$

gives a contradiction.

In case $M_1(U, A_{n_k}) = H(T(A_{n_k+1}), T(A_{n_k}))$, then on taking the limit as $k \to \infty$
in (1), we get

$$ H(S(U), T(U)) \le \psi(H(T(U), T(U))), $$

a contradiction.

Finally, if $M_1(U, A_{n_k}) = \dfrac{H(S(U), T(A_{n_k})) + H(T(A_{n_k+1}), T(U))}{2}$, then on taking the limit as
$k \to \infty$, we have

$$ H(S(U), T(U)) \le \psi\left( \frac{H(S(U), T(U)) + H(T(U), T(U))}{2} \right) = \psi\left( \frac{H(S(U), T(U))}{2} \right), $$

a contradiction.

Hence $S(U) = T(U)$, that is, $U \in CP(S, T)$.
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To prove (ii), suppose that CP (S, T) is complete set in G. Let U, V € CP (S, T) 
and suppose that the Pompeiu—Hausdorff weight assign to the S$ (U) and S (V) is 
not zero. Since pair (S, 7) is a graph yw -contraction, we obtain that 


A(S(U),S(V)) < W(M\(U, V)) 
< w(max{H(T (U), T (V)), H(S(U), T U)), H(S(V), T(V)), 
BOWE Or Ost 


2 
= wimax{H (S(U),S(V)), H(S(U), S(U)), H(S(V), T (V)), 
BOW) SU BOWS). 


2 
= (A(S(U),S(V))), 


a contradiction as y (t) < t for all t > 0. Hence (ii) is proved. 

To prove (iii), suppose that $CP(S, T)$ is complete and that $S$ and $T$ are weakly compatible. First we show that $\mathrm{Fix}(T) \cap \mathrm{Fix}(S)$ is nonempty. Let $W = S(U) = T(U)$; then we have $T(W) = TS(U) = ST(U) = S(W)$, which shows that $W \in CP(S, T)$. Thus the Pompeiu-Hausdorff weight assigned to $S(U)$ and $S(W)$ is zero (by (ii)). Hence $W = S(W) = T(W)$, that is, $W \in \mathrm{Fix}(S) \cap \mathrm{Fix}(T)$. Since $CP(S, T)$ is a singleton set, it follows that $\mathrm{Fix}(S) \cap \mathrm{Fix}(T)$ is a singleton.

Finally, to prove (iv), suppose the set $\mathrm{Fix}(S) \cap \mathrm{Fix}(T)$ is complete. We are to show that $\mathrm{Fix}(T) \cap \mathrm{Fix}(S)$ is a singleton. Assume on the contrary that there exist $U, V \in CB(X)$ such that $U, V \in \mathrm{Fix}(S) \cap \mathrm{Fix}(T)$ and $U \ne V$. By completeness of $\mathrm{Fix}(S) \cap \mathrm{Fix}(T)$, there exists an edge between $U$ and $V$. As the pair $(S, T)$ is a graph $\psi_1$-contraction, we have
$$\begin{aligned}
H(U, V) &= H(S(U), S(V)) \\
&\le \psi(M_1(U, V)) \\
&= \psi\Big(\max\Big\{ H(T(U), T(V)),\ H(S(U), T(U)),\ H(S(V), T(V)),\ \frac{H(S(U), T(V)) + H(S(V), T(U))}{2} \Big\}\Big) \\
&= \psi\Big(\max\Big\{ H(U, V),\ H(U, U),\ H(V, V),\ \frac{H(U, V) + H(V, U)}{2} \Big\}\Big) \\
&= \psi(H(U, V)),
\end{aligned}$$
a contradiction. Hence $U = V$. Conversely, if $\mathrm{Fix}(S) \cap \mathrm{Fix}(T)$ is a singleton, then since $E(G) \supseteq \Delta$, it is obvious that $\mathrm{Fix}(S) \cap \mathrm{Fix}(T)$ is a complete set.


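Example 1 Let $X = \{1, 2, \ldots, n\} = V(G)$, $n > 2$ and $E(G) = \{(i, j) \in X \times X : i \le j\}$. Let $V(G)$ be endowed with the metric $d : X \times X \to \mathbb{R}^+$ defined by
$$d(x, y) = \begin{cases} 0 & \text{if } x = y, \\ \dfrac{1}{n} & \text{if } x, y \in \{1, 2\} \text{ with } x \ne y, \\ \dfrac{n}{n+1} & \text{otherwise.} \end{cases}$$

Furthermore, the Pompeiu-Hausdorff metric is given by
$$H(A, B) = \begin{cases} \dfrac{1}{n} & \text{if } A, B \subseteq \{1, 2\} \text{ with } A \ne B, \\ \dfrac{n}{n+1} & \text{if } A \text{ or } B \text{ or both} \not\subseteq \{1, 2\} \text{ with } A \ne B, \\ 0 & \text{if } A = B. \end{cases}$$

The Pompeiu-Hausdorff weights (for $n = 4$) assigned to $A, B \in CB(X)$ are shown in Fig. 1.

Fig. 1 Graph G defined in Example 1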
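The weights quoted in Example 1 can be spot-checked numerically. The following is only a minimal sketch: it assumes the metric takes the value $1/n$ between the points 1 and 2 and $n/(n+1)$ between any other pair of distinct points (as read from the example), and the helper names `make_metric` and `hausdorff` are hypothetical, not taken from the text.

```python
from fractions import Fraction

def make_metric(n):
    """Metric of Example 1 as read here (an assumption about the garbled source)."""
    def d(x, y):
        if x == y:
            return Fraction(0)
        if x in (1, 2) and y in (1, 2):
            return Fraction(1, n)
        return Fraction(n, n + 1)
    return d

def hausdorff(A, B, d):
    """Pompeiu-Hausdorff distance between two finite nonempty sets."""
    def excess(P, Q):
        # Largest distance from a point of P to the set Q.
        return max(min(d(p, q) for q in Q) for p in P)
    return max(excess(A, B), excess(B, A))

n = 4
d = make_metric(n)
print(hausdorff({3}, {4}, d))            # weight between {3} and {4}
print(hausdorff({1}, {1, 2}, d))         # H(S(A), S(B)) used in case (i)
print(hausdorff({1, 2}, {1, 2, 3}, d))   # H(S(B), T(B)) used in case (i)
```

Exact rational arithmetic is used so that the computed weights can be compared directly with the fractions appearing in the example.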
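Define $S, T : CB(X) \to CB(X)$ as follows:
$$S(U) = \begin{cases} \{1\}, & \text{if } U \subseteq \{1, 2\}, \\ \{1, 2\}, & \text{if } U \not\subseteq \{1, 2\}, \end{cases}
\qquad
T(U) = \begin{cases} \{1\}, & \text{if } U = \{1\}, \\ \{1, 2, 3\}, & \text{if } U \subseteq \{2, 3\}, \\ \{1, 2, \ldots, n\}, & \text{otherwise.} \end{cases}$$

Note that, for all $V \in CB(X)$, $(V, S(V)) \subset E(G)$ and $(V, T(V)) \subset E(G)$.
Let $\psi : \mathbb{R}_+ \to \mathbb{R}_+$ be defined by
$\psi(\alpha) =$

It is easy to verify that $\psi \in \Psi$. Now for all $A, B \in CB(X)$ with $S(A) \ne S(B)$, we consider the following cases: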


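(i) If $A \subseteq \{1, 2\}$ and $B = \{3\}$ with $(A, B) \subset E(G)$, then we have
$$\begin{aligned}
H(S(A), S(B)) &= H(\{1\}, \{1, 2\}) = \frac{1}{n} < \frac{n}{2n+1} = \psi\Big(\frac{n}{n+1}\Big) \\
&= \psi(H(\{1, 2\}, \{1, 2, 3\})) = \psi(H(S(B), T(B))) \le \psi(M_1(A, B)).
\end{aligned}$$

(ii) When $A \subseteq \{1, 2\}$ and $B \not\subseteq \{1, 2, 3\}$ with $(A, B) \subset E(G)$, we have
$$\begin{aligned}
H(S(A), S(B)) &= H(\{1\}, \{1, 2\}) = \frac{1}{n} < \frac{n}{2n+1} = \psi\Big(\frac{n}{n+1}\Big) \\
&= \psi(H(\{1, 2\}, \{1, 2, \ldots, n\})) = \psi(H(S(B), T(B))) \le \psi(M_1(A, B)).
\end{aligned}$$

(iii) In case $A = \{3\}$ and $B \subseteq \{1, 2\}$ with $(A, B) \subset E(G)$, we have
$$\begin{aligned}
H(S(A), S(B)) &= H(\{1, 2\}, \{1\}) = \frac{1}{n} < \frac{n}{2n+1} = \psi\Big(\frac{n}{n+1}\Big) \\
&= \psi(H(\{1, 2\}, \{1, 2, 3\})) = \psi(H(S(A), T(A))) \le \psi(M_1(A, B)).
\end{aligned}$$

(iv) When $A \not\subseteq \{1, 2, 3\}$ and $B \subseteq \{1, 2\}$ with $(A, B) \subset E(G)$, we have
$$\begin{aligned}
H(S(A), S(B)) &= H(\{1, 2\}, \{1\}) = \frac{1}{n} < \frac{n}{2n+1} = \psi\Big(\frac{n}{n+1}\Big) \\
&= \psi(H(\{1, 2\}, \{1, 2, \ldots, n\})) = \psi(H(S(A), T(A))) \le \psi(M_1(A, B)).
\end{aligned}$$

Hence the pair $(S, T)$ is a graph $\psi_1$-contraction. Thus the conditions of Theorem 3 hold. Moreover, $\{1\}$ is the common fixed point of $S$ and $T$, and $\mathrm{Fix}(S) \cap \mathrm{Fix}(T)$ is complete.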


In the next example we show that the given graph $(V(G), E(G))$ need not be a complete graph.

Example 2 Let $X = \{1, 2, \ldots, n\} = V(G)$, $n > 2$ and
$$E(G) = \{(1, 1), (2, 2), \ldots, (n, n), (1, 2), \ldots, (1, n)\}.$$

On $V(G)$, the metric $d : X \times X \to \mathbb{R}^+$ and the Pompeiu-Hausdorff metric $H : CB(X) \times CB(X) \to \mathbb{R}^+$ are defined as in Example 1. The Pompeiu-Hausdorff weights (for $n = 4$) assigned to $A, B \in CB(X)$ are shown in Fig. 2.

Fig. 2 Graph G defined in Example 2

Define $S, T : CB(X) \to CB(X)$ as follows:


_{ Uh #u=(), 
es eae 


_ {1}, if U = {I}, 
i= ne a if U #{l}. 


Note that $(S(A), A) \subset E(G)$ and $(A, T(A)) \subset E(G)$ for all $A \in CB(X)$.
1 
ly, a €[0, 4] 

Take w (a) = 
ee 


Note that $\psi \in \Psi$. Now, for all $A, B \in CB(X)$ with $S(A) \ne S(B)$, we consider the following cases:

(i) If $A = \{1\}$ and $B \ne \{1\}$, then we have
$$H(S(A), S(B)) = \frac{1}{n} = \psi(H(S(B), T(B))) \le \psi(M_1(A, B)).$$

(ii) If $A \ne \{1\}$ and $B = \{1\}$, then we have
$$H(S(A), S(B)) = \frac{1}{n} = \psi(H(S(A), T(A))) \le \psi(M_1(A, B)).$$

Hence the pair $(S, T)$ is a graph $\psi_1$-contraction. Thus all the conditions of Theorem 3 are satisfied. Moreover, $S$ and $T$ have a common fixed point and $\mathrm{Fix}(S) \cap \mathrm{Fix}(T)$ is complete in $CB(X)$.


Theorem 4 Let $(X, d)$ be an $\varepsilon$-chainable complete metric space for some $\varepsilon > 0$ and $S, T : CB(X) \to CB(X)$ be multivalued mappings. Suppose that for all $A, B \in CB(X)$,
$$0 < H(S(A), S(B)) < \varepsilon$$
and there exists a $\psi \in \Psi$ such that
$$H(S(A), S(B)) \le \psi(M(A, B))$$
holds, where
$$M(A, B) = \max\Big\{ H(T(A), T(B)),\ H(S(A), T(A)),\ H(S(B), T(B)),\ \frac{H(S(A), T(B)) + H(S(B), T(A))}{2} \Big\}.$$
Then $S$ and $T$ have a common fixed point provided that $S$ and $T$ are weakly compatible.


Proof By Lemma 1, from $H(A, B) < \varepsilon$, we have for each $a \in A$ an element $b \in B$ such that $d(a, b) < \varepsilon$. Consider the graph $G$ with $V(G) = X$ and
$$E(G) = \{(a, b) \in X \times X : 0 < d(a, b) < \varepsilon\}.$$
Then the $\varepsilon$-chainability of $(X, d)$ implies that $G$ is connected. For $(A, B) \subset E(G)$, we have from the hypothesis
$$H(S(A), S(B)) \le \psi(M(A, B)),$$
where
$$M(A, B) = \max\Big\{ H(T(A), T(B)),\ H(S(A), T(A)),\ H(S(B), T(B)),\ \frac{H(S(A), T(B)) + H(S(B), T(A))}{2} \Big\},$$
which implies that the pair $(S, T)$ is a graph $\psi_1$-contraction.

Also, $G$ has property (P*). Indeed, if $\{X_n\}$ in $CB(X)$ with $X_n \to X$ as $n \to \infty$ and $(X_n, X_{n+1}) \subset E(G)$ for $n \in \mathbb{N}$, then there is a subsequence $\{X_{n_k}\}$ of $\{X_n\}$ such that $(X_{n_k}, X) \subset E(G)$ for $k \in \mathbb{N}$. So by Theorem 3 (iii), $S$ and $T$ have a common fixed point.
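The graph used in this proof can be illustrated on a finite sample of points. The sketch below assumes the usual notion of $\varepsilon$-chainability (any two points are joined by a finite chain whose consecutive steps are shorter than $\varepsilon$) and tests connectivity of the graph with edges $0 < d(a, b) < \varepsilon$ by breadth-first search; the function name `is_eps_chainable` and the sample points are hypothetical.

```python
from collections import deque

def is_eps_chainable(points, eps, dist):
    """True if the graph with edges 0 < dist(a, b) < eps on `points` is connected."""
    if not points:
        return True
    seen = {0}
    queue = deque([0])
    while queue:
        i = queue.popleft()
        for j in range(len(points)):
            if j not in seen and 0 < dist(points[i], points[j]) < eps:
                seen.add(j)
                queue.append(j)
    return len(seen) == len(points)

def euclid(a, b):
    return abs(a - b)

sample = [0.0, 0.4, 0.8, 2.0]
print(is_eps_chainable(sample, 0.5, euclid))  # the gap between 0.8 and 2.0 is too wide
print(is_eps_chainable(sample, 1.3, euclid))  # every consecutive gap is below 1.3
```

A finite sample can only illustrate the definition; chainability of the whole space $(X, d)$ is a hypothesis of Theorem 4 and is not something such a check establishes.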


Corollary 1 Let $(X, d)$ be a complete metric space endowed with a directed graph $G$ such that $V(G) = X$ and $E(G) \supseteq \Delta$. Suppose that the mapping $S : CB(X) \to CB(X)$ satisfies the following:

(a) for every $V$ in $CB(X)$, $(V, S(V)) \subset E(G)$.
(b) There exists $\psi \in \Psi$ such that an edge between $A$ and $B$ implies that
$$H(S(A), S(B)) \le \psi(M_1(A, B)),$$
where
$$M_1(A, B) = \max\Big\{ H(A, B),\ H(A, S(A)),\ H(B, S(B)),\ \frac{H(A, S(B)) + H(B, S(A))}{2} \Big\}.$$

Then the following statements hold:

(i) If $\mathrm{Fix}(S)$ is complete, then the Pompeiu-Hausdorff weight assigned to $U, V \in \mathrm{Fix}(S)$ is 0.
(ii) If the weakly connected graph $G$ satisfies the property (P*), then $S$ has a fixed point.
(iii) $\mathrm{Fix}(S)$ is complete if and only if $\mathrm{Fix}(S)$ is a singleton.

Proof Take $T = I$ (identity map) in (1); then Corollary 1 follows from Theorem 3.


Theorem 5 Let $(X, d)$ be a metric space endowed with a directed graph $G$ such that $V(G) = X$, $E(G) \supseteq \Delta$ and $S, T : CB(X) \to CB(X)$ a graph $\psi_2$-contraction pair such that the range of $T$ contains the range of $S$. Then the following statements hold:

(i) $CP(S, T) \ne \emptyset$ provided that $G$ is weakly connected, satisfies the property (P*), and $T(X)$ is a complete subspace of $CB(X)$.
(ii) If $CP(S, T)$ is complete, then the Pompeiu-Hausdorff weight assigned to $S(U)$ and $S(V)$ is 0 for all $U, V \in CP(S, T)$.
(iii) If $CP(S, T)$ is complete and $S$ and $T$ are weakly compatible, then $\mathrm{Fix}(S) \cap \mathrm{Fix}(T)$ is a singleton.
(iv) $\mathrm{Fix}(S) \cap \mathrm{Fix}(T)$ is complete if and only if $\mathrm{Fix}(S) \cap \mathrm{Fix}(T)$ is a singleton.

Proof To prove (i), let $A_0$ be an arbitrary element in $CB(X)$. Since the range of $T$ contains the range of $S$, we can choose $A_1 \in CB(X)$ such that $S(A_0) = T(A_1)$. Continuing this process, having chosen $A_n$ in $CB(X)$, we obtain an $A_{n+1}$ in $CB(X)$ such that $S(A_n) = T(A_{n+1})$ for $n \in \mathbb{N}$. The inclusions $(A_{n+1}, T(A_{n+1})) \subset E(G)$ and $(T(A_{n+1}), A_n) = (S(A_n), A_n) \subset E(G)$ imply that $(A_{n+1}, A_n) \subset E(G)$.

We may assume that $S(A_n) \ne S(A_{n+1})$ for all $n \in \mathbb{N}$. If not, then $S(A_{2k}) = S(A_{2k+1})$ for some $k$, which implies $T(A_{2k+1}) = S(A_{2k+1})$, and thus $A_{2k+1} \in CP(S, T)$. Now, since $(A_{n+1}, A_n) \subset E(G)$ for all $n \in \mathbb{N}$, and the pair $(S, T)$ forms a graph $\psi_2$-contraction, we have
$$H(T(A_{n+1}), T(A_{n+2})) = H(S(A_n), S(A_{n+1})) \le \psi(M_2(A_n, A_{n+1})),$$
where
$$\begin{aligned}
M_2(A_n, A_{n+1}) &= \alpha H(T(A_n), T(A_{n+1})) + \beta H(S(A_n), T(A_n)) + \gamma H(S(A_{n+1}), T(A_{n+1})) \\
&\quad + \delta_1 H(S(A_n), T(A_{n+1})) + \delta_2 H(S(A_{n+1}), T(A_n)) \\
&= \alpha H(T(A_n), T(A_{n+1})) + \beta H(T(A_{n+1}), T(A_n)) + \gamma H(T(A_{n+2}), T(A_{n+1})) \\
&\quad + \delta_1 H(T(A_{n+1}), T(A_{n+1})) + \delta_2 H(T(A_{n+2}), T(A_n)) \\
&\le (\alpha + \beta) H(T(A_n), T(A_{n+1})) + \gamma H(T(A_{n+1}), T(A_{n+2})) \\
&\quad + \delta_2 [H(T(A_{n+2}), T(A_{n+1})) + H(T(A_{n+1}), T(A_n))] \\
&= (\alpha + \beta + \delta_2) H(T(A_n), T(A_{n+1})) + (\gamma + \delta_2) H(T(A_{n+1}), T(A_{n+2})).
\end{aligned}$$

Now, if $H(T(A_n), T(A_{n+1})) \le H(T(A_{n+1}), T(A_{n+2}))$, we have
$$\begin{aligned}
H(T(A_{n+1}), T(A_{n+2})) &\le \psi(\max\{H(T(A_n), T(A_{n+1})),\ H(T(A_{n+1}), T(A_{n+2}))\}) \\
&= \psi(H(T(A_n), T(A_{n+1})))
\end{aligned}$$
for all $n \in \mathbb{N}$. Therefore for $i = 1, 2, \ldots, n$, we have
$$\begin{aligned}
H(T(A_{i-1}), T(A_i)) &\le \psi(H(A_{i-1}, A_i)), \\
H(T(A_{i-2}), T(A_{i-1})) &\le \psi(H(A_{i-2}, A_{i-1})), \\
&\ \ \vdots \\
H(T(A_0), T(A_1)) &\le \psi(H(A_0, A_1)),
\end{aligned}$$
and so we obtain
$$H(T(A_n), T(A_{n+1})) \le \psi^{n}(H(A_0, T(A_1)))$$
for all $n \in \mathbb{N}$. Following arguments similar to those in the proof of Theorem 3, we get $H(T(A_n), T(A_m)) \to 0$ as $n, m \to \infty$. Therefore $\{T(A_n)\}$ is a Cauchy sequence in $T(X)$. Since $(T(X), d)$ is complete in $CB(X)$, we have $T(A_n) \to V$ as $n \to \infty$ for some $V \in CB(X)$. Also, we can find $U$ in $CB(X)$ such that $T(U) = V$.

We claim that $S(U) = T(U)$. If not, then since $(T(A_{n+1}), T(A_n)) \subset E(G)$, by property (P*) there exists a subsequence $\{T(A_{n_k+1})\}$ of $\{T(A_{n+1})\}$ such that $(T(U), T(A_{n_k+1})) \subset E(G)$ for every $k \in \mathbb{N}$. As $(U, T(U)) \subset E(G)$ and $(T(A_{n_k+1}), A_{n_k}) = (S(A_{n_k}), A_{n_k}) \subset E(G)$, it follows that $(U, A_{n_k}) \subset E(G)$. Now
$$H(S(U), T(A_{n_k+1})) = H(S(U), S(A_{n_k})) \le \psi(M_2(U, A_{n_k})), \qquad (2)$$
where
$$\begin{aligned}
M_2(U, A_{n_k}) &= \alpha H(T(U), T(A_{n_k})) + \beta H(S(U), T(U)) + \gamma H(S(A_{n_k}), T(A_{n_k})) \\
&\quad + \delta_1 H(S(U), T(A_{n_k})) + \delta_2 H(S(A_{n_k}), T(U)) \\
&= \alpha H(T(U), T(A_{n_k})) + \beta H(S(U), T(U)) + \gamma H(T(A_{n_k+1}), T(A_{n_k})) \\
&\quad + \delta_1 H(S(U), T(A_{n_k})) + \delta_2 H(T(A_{n_k+1}), T(U)).
\end{aligned}$$

On taking the limit as $k \to \infty$ in (2), we have
$$H(S(U), T(U)) \le \psi((\beta + \delta_1) H(S(U), T(U))) < H(S(U), T(U)),$$
a contradiction. Hence $S(U) = T(U)$, that is, $U \in CP(S, T)$.

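To prove (ii), suppose that $CP(S, T)$ is a complete set in $G$. Let $U, V \in CP(S, T)$ and suppose that the Pompeiu-Hausdorff weight assigned to $S(U)$ and $S(V)$ is not zero. Since the pair $(S, T)$ is a graph $\psi_2$-contraction, we obtain that
$$H(S(U), S(V)) \le \psi(M_2(U, V)), \qquad (3)$$
where
$$\begin{aligned}
M_2(U, V) &= \alpha H(T(U), T(V)) + \beta H(S(U), T(U)) + \gamma H(S(V), T(V)) \\
&\quad + \delta_1 H(S(U), T(V)) + \delta_2 H(S(V), T(U)) \\
&= \alpha H(S(U), S(V)) + \beta H(S(U), S(U)) + \gamma H(S(V), T(V)) \\
&\quad + \delta_1 H(S(U), S(V)) + \delta_2 H(S(V), S(U)) \\
&= (\alpha + \delta_1 + \delta_2) H(S(U), S(V)),
\end{aligned}$$
thus
$$H(S(U), S(V)) \le \psi((\alpha + \delta_1 + \delta_2) H(S(U), S(V))) \le \psi(H(S(U), S(V))),$$
a contradiction as $\psi(t) < t$ for all $t > 0$. Hence (ii) is proved.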

To prove (iii), suppose that $CP(S, T)$ is complete and that $S$ and $T$ are weakly compatible. First we show that $\mathrm{Fix}(T) \cap \mathrm{Fix}(S)$ is nonempty. Let $W = S(U) = T(U)$; then we have $T(W) = TS(U) = ST(U) = S(W)$, which shows that $W \in CP(S, T)$. Thus the Pompeiu-Hausdorff weight assigned to $S(U)$ and $S(W)$ is zero (by (ii)). Hence $W = S(W) = T(W)$, that is, $W \in \mathrm{Fix}(S) \cap \mathrm{Fix}(T)$. Since $CP(S, T)$ is a singleton set, it follows that $\mathrm{Fix}(S) \cap \mathrm{Fix}(T)$ is a singleton.

Finally, to prove (iv), suppose the set $\mathrm{Fix}(S) \cap \mathrm{Fix}(T)$ is complete. We are to show that $\mathrm{Fix}(T) \cap \mathrm{Fix}(S)$ is a singleton. Assume on the contrary that there exist $U, V \in CB(X)$ such that $U, V \in \mathrm{Fix}(S) \cap \mathrm{Fix}(T)$ and $U \ne V$. By completeness of $\mathrm{Fix}(S) \cap \mathrm{Fix}(T)$, there exists an edge between $U$ and $V$. As the pair $(S, T)$ is a graph $\psi_2$-contraction, we have
$$\begin{aligned}
H(U, V) &= H(S(U), S(V)) \\
&\le \psi(M_2(U, V)) \\
&= \psi(\alpha H(T(U), T(V)) + \beta H(S(U), T(U)) + \gamma H(S(V), T(V)) \\
&\qquad + \delta_1 H(S(U), T(V)) + \delta_2 H(S(V), T(U))) \\
&= \psi(\alpha H(U, V) + \beta H(U, U) + \gamma H(V, V) + \delta_1 H(U, V) + \delta_2 H(V, U)) \\
&\le \psi(H(U, V)),
\end{aligned}$$
a contradiction. Hence $U = V$. Conversely, if $\mathrm{Fix}(S) \cap \mathrm{Fix}(T)$ is a singleton, then since $E(G) \supseteq \Delta$, it is obvious that $\mathrm{Fix}(S) \cap \mathrm{Fix}(T)$ is a complete set.


Example 3 Let $X = \mathbb{R}_+ = V(G)$ be endowed with the Euclidean metric $d$. Let $f$ :


X — X be defined as f (x) = iia and (a, b) € E(G) for some 


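$a \in A$, $b \in B$ if $b = f(a)$. Define $S, T : CB(X) \to CB(X)$ as follows:
$$S(U) = \begin{cases} [0, 10], & \text{if } U \subseteq [0, 10], \\ [10, 20], & \text{otherwise,} \end{cases}
\qquad
T(U) = \begin{cases} [0, 10], & \text{if } U \subseteq [0, 10], \\ [5, 25], & \text{otherwise.} \end{cases}$$

Note that, for all $V \in CB(X)$, $(S(V), V) \subset E(G)$ and $(V, T(V)) \subset E(G)$.
Let $\psi : \mathbb{R}_+ \to \mathbb{R}_+$ be defined by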


#r O<t<i 
v(@)= 


5 
zt, 1<t. 


It is easy to verify that $\psi \in \Psi$. Now for all $A, B \in CB(X)$ with $S(A) \ne S(B)$, we consider $A \subseteq [0, 10]$ and $B \not\subseteq [0, 10]$ with $(A, B) \subset E(G)$, which implies

$$H(S(A), S(B)) = H([0, 10], [10, 20])$$
=10< we 
9 
= W (15a + 58) 


$$= \psi(\alpha H([0, 10], [5, 25]) + \gamma H([10, 20], [5, 25]))$$
$$= \psi(\alpha H(T(A), T(B)) + \gamma H(S(B), T(B))) \le \psi(M_2(A, B)),$$


where a = 2, y = 4, B = 6, = 6. = Oand 


$$M_2(A, B) = \alpha H(T(A), T(B)) + \beta H(S(A), T(A)) + \gamma H(S(B), T(B)) + \delta_1 H(S(A), T(B)) + \delta_2 H(S(B), T(A)).$$
Hence the pair $(S, T)$ is a graph $\psi_2$-contraction. Thus all the conditions of Theorem 5 are satisfied. Moreover, the set $[0, 10]$ is the common fixed point of $S$ and $T$, and $\mathrm{Fix}(S) \cap \mathrm{Fix}(T)$ is complete.
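The interval distances used in Example 3 can be checked with a few lines of code. This is a sketch that relies on the standard fact that, for closed bounded intervals of the real line with the Euclidean metric, the Pompeiu-Hausdorff distance equals the larger of the two endpoint gaps; the helper name `hausdorff_interval` is hypothetical.

```python
def hausdorff_interval(I, J):
    """Pompeiu-Hausdorff distance between closed bounded intervals I = [a, b] and J = [c, d]."""
    (a, b), (c, d) = I, J
    return max(abs(a - c), abs(b - d))

# Intervals appearing in Example 3.
print(hausdorff_interval((0, 10), (10, 20)))  # H(S(A), S(B))
print(hausdorff_interval((0, 10), (5, 25)))   # H(T(A), T(B))
print(hausdorff_interval((10, 20), (5, 25)))  # H(S(B), T(B))
```

The last two values are the coefficients of $\alpha$ and $\gamma$ in the argument of $\psi$ in the example.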


The following corollary generalizes and extends [3, Theorem 2.1]. 


Corollary 2 Let $(X, d)$ be a complete metric space endowed with a directed graph $G$ such that $V(G) = X$ and $E(G) \supseteq \Delta$. Suppose that the mappings $S, T : CB(X) \to CB(X)$ satisfy the following:

(a) for every $V$ in $CB(X)$, $(S(V), V) \subset E(G)$ and $(V, T(V)) \subset E(G)$.
(b) There exists $\psi \in \Psi$ such that for all $A, B \in CB(X)$, an edge between $A$ and $B$ implies that
$$H(S(A), S(B)) \le \psi(\alpha H(T(A), T(B)) + \beta H(S(A), T(A)) + \gamma H(S(B), T(B)))$$
holds, where $\alpha, \beta, \gamma$ are nonnegative real numbers with $\alpha + \beta + \gamma \le 1$.

If the range of $T$ contains the range of $S$, then the following statements hold:

(i) $CP(S, T) \ne \emptyset$ provided that $G$ is weakly connected, satisfies the property (P*), and $T(X)$ is a complete subspace of $CB(X)$.
(ii) If $CP(S, T)$ is complete, then the Pompeiu-Hausdorff weight assigned to $S(U)$ and $S(V)$ is 0 for all $U, V \in CP(S, T)$.
(iii) If $CP(S, T)$ is complete and $S$ and $T$ are weakly compatible, then $\mathrm{Fix}(S) \cap \mathrm{Fix}(T)$ is a singleton.
(iv) $\mathrm{Fix}(S) \cap \mathrm{Fix}(T)$ is complete if and only if $\mathrm{Fix}(S) \cap \mathrm{Fix}(T)$ is a singleton.


Corollary 3 Let $(X, d)$ be a complete metric space endowed with a directed graph $G$ such that $V(G) = X$ and $E(G) \supseteq \Delta$. Suppose that the mapping $S : CB(X) \to CB(X)$ satisfies the following:

(a) for every $V$ in $CB(X)$, $(S(V), V) \subset E(G)$.
(b) There exists $\psi \in \Psi$ such that for all $A, B \in CB(X)$, an edge between $A$ and $B$ implies that
$$H(S(A), S(B)) \le \psi(\alpha H(A, B) + \beta H(S(A), A) + \gamma H(B, S(B)))$$
holds, where $\alpha, \beta, \gamma$ are nonnegative real numbers with $\alpha + \beta + \gamma \le 1$.

Then the following statements hold:

(i) If $\mathrm{Fix}(S)$ is complete, then the Pompeiu-Hausdorff weight assigned to $U, V \in \mathrm{Fix}(S)$ is 0.
(ii) If the weakly connected graph $G$ satisfies the property (P*), then $S$ has a fixed point.
(iii) $\mathrm{Fix}(S)$ is complete if and only if $\mathrm{Fix}(S)$ is a singleton.


Proof If we take $T = I$ (identity map) in Corollary 2, the result follows.

Remark 1 (1) If $E(G) := X \times X$, then clearly $G$ is connected and our Theorem 2.1 improves and generalizes Theorem 2.1 in [3], Theorem 2.1 in [9], and Theorem 3.1 in [20].

(2) If $E(G) := X \times X$, then clearly $G$ is connected and our Theorem 2.4 extends and generalizes Theorem 2.5 in [9], Theorem 3.2 in [23], Theorem 5.1 in [16] and Theorem 3.1 in [20].

(3) If $E(G) := X \times X$, then clearly $G$ is connected and our Corollary 2.5 improves and generalizes Theorem 2.1 in [9], Theorem 3.2 in [23] and Theorem 3.1 in [20].


3 Conclusion 


Jachymski and Jozwik initiated the study of metric fixed point theory in which the order structure on a metric space is replaced by a graph structure. Recently, many results have appeared in the literature on fixed point problems for mappings on metric spaces endowed with a graph. We presented common fixed point results for a class of multivalued maps with set-valued domain that commute only at their coincidence points, in a metric space endowed with a directed graph. We also presented some examples to show the validity of the obtained results.

Common Fixed Point Results for Family
of Generalized Multivalued F-Contraction
Mappings in Ordered Metric Spaces


Talat Nazir and Sergei Silvestrov 


Abstract In this paper, we study the existence of common fixed points of a family of multivalued mappings satisfying generalized F-contractive conditions in ordered metric spaces. These results establish some general common fixed point theorems for a family of multivalued maps.


Keywords Common fixed point - Multivalued mapping - F-contraction - Ordered 
metric space 


1 Introduction and Preliminaries 


Markin [16] initiated the study of fixed points for multivalued nonexpansive and contractive maps. Later, a rich and interesting fixed point theory for such multivalued maps was developed; see, for instance, [6, 7, 9-11, 14, 15, 18-20, 23]. The theory of multivalued maps has various applications in convex optimization, dynamical systems, commutative algebra, differential equations and economics. Recently, Wardowski [25] introduced a new contraction called F-contraction and proved a fixed point result as a generalization of the Banach contraction principle. Abbas et al. [3] obtained common fixed point results by employing the F-contraction condition. Further in this direction, Abbas et al. [4] introduced a notion of generalized F-contraction mapping and employed their results to obtain a fixed point of a generalized nonexpansive mapping on star-shaped subsets of normed linear spaces. Minak et al. [17] proved some fixed point results for Ćirić type generalized F-contractions on complete metric spaces. Recently, some fixed point results for multivalued F-contraction maps on complete metric spaces were established in [5].

T. Nazir (✉) · S. Silvestrov
Division of Applied Mathematics, School of Education, Culture and Communication, Mälardalen University, Box 883, 721 23 Västerås, Sweden
e-mail: talat@ciit.net.pk

T. Nazir
Department of Mathematics, COMSATS Institute of Information Technology, Abbottabad 22060, Pakistan
e-mail: sergei.silvestrov@mdh.se

© Springer International Publishing Switzerland 2016
S. Silvestrov and M. Rančić (eds.), Engineering Mathematics II, Springer Proceedings in Mathematics & Statistics 179, DOI 10.1007/978-3-319-42105-6_20

The aim of this paper is to prove common fixed point theorems for a family of multivalued generalized F-contraction mappings in a partially ordered metric space without using any commutativity condition. These results extend and unify various comparable results in the literature [12, 13, 21, 22].

We begin with some basic known definitions and results which will be used in the sequel. Throughout this article, $\mathbb{N}$, $\mathbb{R}^+$, $\mathbb{R}$ denote the set of natural numbers, the set of positive real numbers and the set of real numbers, respectively.

Let $\mathcal{F}$ be the collection of all mappings $F : \mathbb{R}^+ \to \mathbb{R}$ that satisfy the following conditions:

($F_1$) $F$ is strictly increasing, that is, for all $a, b \in \mathbb{R}^+$, $a < b$ implies that $F(a) < F(b)$.
($F_2$) For every sequence $\{a_n\}$ of positive real numbers, $\lim_{n \to \infty} a_n = 0$ and $\lim_{n \to \infty} F(a_n) = -\infty$ are equivalent.
($F_3$) There exists $\lambda \in (0, 1)$ such that $\lim_{a \to 0^+} a^{\lambda} F(a) = 0$.


Definition 20.1 ([25]) Let $(X, d)$ be a metric space and $F \in \mathcal{F}$. A mapping $f : X \to X$ is said to be an F-contraction on $X$ if there exists $\tau > 0$ such that $d(fx, fy) > 0$ implies that
$$\tau + F(d(fx, fy)) \le F(d(x, y))$$
for all $x, y \in X$.
Wardowski [25] gave the following result.

Theorem 20.1 Let $(X, d)$ be a complete metric space and let the mapping $f : X \to X$ be an F-contraction. Then there exists a unique $x$ in $X$ such that $x = fx$. Moreover, for any $x_0 \in X$, the iterative sequence $x_{n+1} = f(x_n)$ converges to $x$.
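A small numerical illustration of Definition 20.1 and Theorem 20.1 may help. The sketch below makes its own choices, none of which come from the text: $F(a) = \ln a$ (which satisfies ($F_1$)-($F_3$)), the map $f(x) = x/2$ on the real line, $\tau = 0.5$, and a finite set of sample points. The helper names `is_F_contraction` and `picard` are hypothetical. A check on finitely many points cannot prove the contraction condition; it only illustrates it.

```python
import math

def is_F_contraction(f, F, tau, samples):
    """Check tau + F(d(fx, fy)) <= F(d(x, y)) on all sample pairs with d(fx, fy) > 0."""
    for x in samples:
        for y in samples:
            d_img = abs(f(x) - f(y))
            if d_img > 0 and tau + F(d_img) > F(abs(x - y)):
                return False
    return True

def picard(f, x0, steps):
    """Iterate x_{n+1} = f(x_n) for a fixed number of steps."""
    x = x0
    for _ in range(steps):
        x = f(x)
    return x

def f(x):
    return x / 2  # illustrative map; its unique fixed point is 0

samples = [0.0, 0.5, 1.0, 3.0, 10.0]
ok = is_F_contraction(f, math.log, 0.5, samples)
limit = picard(f, 10.0, 60)
print(ok, limit)
```

With $F = \ln$, the F-contraction inequality reads $d(fx, fy) \le e^{-\tau} d(x, y)$, so this choice recovers an ordinary Banach contraction with constant $e^{-\tau}$.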


Kannan [12] proved a fixed point theorem for a single-valued self mapping $T$ of a metric space $X$ satisfying the property
$$d(Tx, Ty) \le h\{d(x, Tx) + d(y, Ty)\}$$
for all $x, y$ in $X$ and for a fixed $h \in [0, \tfrac{1}{2})$.
Ćirić [8] considered a mapping $T : X \to X$ satisfying the following contractive condition:
$$d(Tx, Ty) \le q \max\{d(x, y), d(x, Tx), d(y, Ty), d(x, Ty), d(y, Tx)\},$$
where $q \in [0, 1)$. He proved the existence of a fixed point when $X$ is a $T$-orbitally complete metric space.

Latif and Beg [13] extended mappings considered by Kannan to multivalued mappings and introduced the notion of a K-multivalued mapping. Rus [21] coined the term R-multivalued mapping, which is a generalization of a K-multivalued mapping (see also [2]). Abbas and Rhoades [1] studied common fixed point problems for multivalued mappings and introduced the notion of generalized R-multivalued mappings, which in turn generalizes R-multivalued mappings.

Let $(X, d)$ be a metric space. Denote by $P(X)$ the family of all nonempty subsets of $X$, and by $P_{cl}(X)$ the family of all nonempty closed subsets of $X$.

A point $x$ in $X$ is called a fixed point of a multivalued mapping $T : X \to P(X)$ provided $x \in Tx$. The collection of all fixed points of $T$ is denoted by $\mathrm{Fix}(T)$.

Recall that a map $T : X \to P_{cl}(X)$ is said to be upper semicontinuous if, for $x_n \in X$ and $y_n \in Tx_n$ with $x_n \to x_0$ and $y_n \to y_0$, we have $y_0 \in Tx_0$ (see [24]).


Definition 20.2 Let $X$ be a nonempty set. Then $(X, d, \preceq)$ is called a partially ordered metric space if and only if $d$ is a metric on a partially ordered set $(X, \preceq)$.

We define $\Delta_1, \Delta_2 \subseteq X \times X$ as follows:
$$\Delta_1 = \{(x, y) \in X \times X : x \preceq y\},$$
$$\Delta_2 = \{(x, y) \in X \times X : x \prec y\}.$$

Definition 20.3 A subset $\Gamma$ of a partially ordered set $X$ is said to be well-ordered if every two elements of $\Gamma$ are comparable.


2 Common Fixed Point Theorems 


In this section, we obtain common fixed point theorems for family of multivalued 
mappings. We begin with the following result. 


Theorem 20.2 Let $(X, d, \preceq)$ be an ordered complete metric space and $\{T_i\}_{i=1}^{m} : X \to P_{cl}(X)$ be a family of multivalued mappings. Suppose that for every $(x, y) \in \Delta_1$ and $u_x \in T_i(x)$, there exists $u_y \in T_{i+1}(y)$ for $i \in \{1, 2, \ldots, m\}$ (with $T_{m+1} = T_1$ by convention) such that $(u_x, u_y) \in \Delta_2$ implies
$$\tau + F(d(u_x, u_y)) \le F(M(x, y; u_x, u_y)), \qquad (1)$$
where $\tau$ is a positive real number and
$$M(x, y; u_x, u_y) = \max\Big\{ d(x, y),\ d(x, u_x),\ d(y, u_y),\ \frac{d(x, u_y) + d(y, u_x)}{2} \Big\}.$$

Then the following statements hold:

(i) $\mathrm{Fix}(T_i) \ne \emptyset$ for any $i \in \{1, 2, \ldots, m\}$ if and only if $\mathrm{Fix}(T_1) = \mathrm{Fix}(T_2) = \cdots = \mathrm{Fix}(T_m) \ne \emptyset$.
(ii) $\mathrm{Fix}(T_1) = \mathrm{Fix}(T_2) = \cdots = \mathrm{Fix}(T_m) \ne \emptyset$ provided that any one $T_i$ for $i \in \{1, 2, \ldots, m\}$ is upper semicontinuous.
(iii) $\bigcap_{i=1}^{m} \mathrm{Fix}(T_i)$ is well-ordered if and only if $\bigcap_{i=1}^{m} \mathrm{Fix}(T_i)$ is a singleton set.


Proof To prove (i), let $x^* \in T_k(x^*)$ for any $k \in \{1, 2, \ldots, m\}$. Assume that $x^* \notin T_{k+1}(x^*)$; then there exists an $x \in T_{k+1}(x^*)$ with $(x^*, x) \in \Delta_2$ such that
$$\tau + F(d(x^*, x)) \le F(M(x^*, x^*; x^*, x)),$$
where
$$M(x^*, x^*; x^*, x) = \max\Big\{ d(x^*, x^*),\ d(x^*, x^*),\ d(x, x^*),\ \frac{d(x^*, x) + d(x^*, x^*)}{2} \Big\} = d(x, x^*),$$
which implies that
$$\tau + F(d(x^*, x)) \le F(d(x^*, x)),$$
a contradiction as $\tau > 0$. Thus $x^* = x$. Thus $x^* \in T_{k+1}(x^*)$ and so $\mathrm{Fix}(T_k) \subseteq \mathrm{Fix}(T_{k+1})$. Similarly, we obtain that $\mathrm{Fix}(T_{k+1}) \subseteq \mathrm{Fix}(T_{k+2})$ and, continuing this way, we get $\mathrm{Fix}(T_1) = \mathrm{Fix}(T_2) = \cdots = \mathrm{Fix}(T_m)$. The converse is straightforward.

To prove (ii), suppose that $x_0$ is an arbitrary point of $X$. If $x_0 \in T_{k_0}(x_0)$ for any $k_0 \in \{1, 2, \ldots, m\}$, then by using (i), the proof is finished. So we assume that $x_0 \notin T_{k_0}(x_0)$ for any $k_0 \in \{1, 2, \ldots, m\}$. Now for $i \in \{1, 2, \ldots, m\}$, if $x_1 \in T_i(x_0)$, then there exists $x_2 \in T_{i+1}(x_1)$ with $(x_1, x_2) \in \Delta_2$ such that
$$\tau + F(d(x_1, x_2)) \le F(M(x_0, x_1; x_1, x_2)),$$
where
$$\begin{aligned}
M(x_0, x_1; x_1, x_2) &= \max\Big\{ d(x_0, x_1),\ d(x_0, x_1),\ d(x_1, x_2),\ \frac{d(x_0, x_2) + d(x_1, x_1)}{2} \Big\} \\
&= \max\Big\{ d(x_0, x_1),\ d(x_1, x_2),\ \frac{d(x_0, x_2)}{2} \Big\} \\
&= \max\{d(x_0, x_1), d(x_1, x_2)\}.
\end{aligned}$$
Now, if $M(x_0, x_1; x_1, x_2) = d(x_1, x_2)$, then
$$\tau + F(d(x_1, x_2)) \le F(d(x_1, x_2)),$$
a contradiction as $\tau > 0$. Therefore $M(x_0, x_1; x_1, x_2) = d(x_0, x_1)$ and we have
$$\tau + F(d(x_1, x_2)) \le F(d(x_0, x_1)).$$

Next, for this $x_2 \in T_{i+1}(x_1)$, there exists $x_3 \in T_{i+2}(x_2)$ with $(x_2, x_3) \in \Delta_2$ such that
$$\tau + F(d(x_2, x_3)) \le F(M(x_1, x_2; x_2, x_3)),$$
where
$$\begin{aligned}
M(x_1, x_2; x_2, x_3) &= \max\Big\{ d(x_1, x_2),\ d(x_1, x_2),\ d(x_2, x_3),\ \frac{d(x_1, x_3) + d(x_2, x_2)}{2} \Big\} \\
&= \max\{d(x_1, x_2), d(x_2, x_3)\}.
\end{aligned}$$
Now, if $M(x_1, x_2; x_2, x_3) = d(x_2, x_3)$, then
$$\tau + F(d(x_2, x_3)) \le F(d(x_2, x_3)),$$
a contradiction as $\tau > 0$. Therefore $M(x_1, x_2; x_2, x_3) = d(x_1, x_2)$ and we have
$$\tau + F(d(x_2, x_3)) \le F(d(x_1, x_2)).$$


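Continuing this process, for $x_{2n} \in T_i(x_{2n-1})$, there exists $x_{2n+1} \in T_{i+1}(x_{2n})$ with $(x_{2n}, x_{2n+1}) \in \Delta_2$ such that
$$\tau + F(d(x_{2n}, x_{2n+1})) \le F(M(x_{2n-1}, x_{2n}; x_{2n}, x_{2n+1})),$$
where
$$\begin{aligned}
M(x_{2n-1}, x_{2n}; x_{2n}, x_{2n+1}) &= \max\Big\{ d(x_{2n-1}, x_{2n}),\ d(x_{2n-1}, x_{2n}),\ d(x_{2n}, x_{2n+1}),\ \frac{d(x_{2n-1}, x_{2n+1}) + d(x_{2n}, x_{2n})}{2} \Big\} \\
&= \max\Big\{ d(x_{2n-1}, x_{2n}),\ d(x_{2n}, x_{2n+1}),\ \frac{d(x_{2n-1}, x_{2n+1})}{2} \Big\} \\
&= d(x_{2n-1}, x_{2n}),
\end{aligned}$$
that is,
$$\tau + F(d(x_{2n}, x_{2n+1})) \le F(d(x_{2n-1}, x_{2n})).$$

Similarly, for $x_{2n+1} \in T_{i+1}(x_{2n})$, there exists $x_{2n+2} \in T_{i+2}(x_{2n+1})$ such that $(x_{2n+1}, x_{2n+2}) \in \Delta_2$ implies
$$\tau + F(d(x_{2n+1}, x_{2n+2})) \le F(d(x_{2n}, x_{2n+1})).$$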


Hence, we obtain a sequence $\{x_n\}$ in $X$ such that for $x_n \in T_i(x_{n-1})$, there exists $x_{n+1} \in T_{i+1}(x_n)$ with $(x_n, x_{n+1}) \in \Delta_2$ such that
$$\tau + F(d(x_n, x_{n+1})) \le F(d(x_{n-1}, x_n)).$$
Therefore
$$F(d(x_n, x_{n+1})) \le F(d(x_{n-1}, x_n)) - \tau \le F(d(x_{n-2}, x_{n-1})) - 2\tau \le \cdots \le F(d(x_0, x_1)) - n\tau. \qquad (2)$$
From (2), we obtain $\lim_{n \to \infty} F(d(x_n, x_{n+1})) = -\infty$, which together with ($F_2$) gives
$$\lim_{n \to \infty} d(x_n, x_{n+1}) = 0.$$
From ($F_3$), there exists $\lambda \in (0, 1)$ such that
$$\lim_{n \to \infty} [d(x_n, x_{n+1})]^{\lambda} F(d(x_n, x_{n+1})) = 0.$$
From (2), we have
$$[d(x_n, x_{n+1})]^{\lambda} F(d(x_n, x_{n+1})) - [d(x_n, x_{n+1})]^{\lambda} F(d(x_0, x_1)) \le -n\tau [d(x_n, x_{n+1})]^{\lambda} \le 0.$$
On taking the limit as $n \to \infty$ we obtain
$$\lim_{n \to \infty} n [d(x_n, x_{n+1})]^{\lambda} = 0.$$
Hence $\lim_{n \to \infty} n^{1/\lambda} d(x_n, x_{n+1}) = 0$ and there exists $n_1 \in \mathbb{N}$ such that $n^{1/\lambda} d(x_n, x_{n+1}) \le 1$ for all $n \ge n_1$. So we have
$$d(x_n, x_{n+1}) \le \frac{1}{n^{1/\lambda}}$$
for all $n \ge n_1$. Now consider $m, n \in \mathbb{N}$ such that $m > n \ge n_1$; we have
$$d(x_n, x_m) \le d(x_n, x_{n+1}) + d(x_{n+1}, x_{n+2}) + \cdots + d(x_{m-1}, x_m) \le \sum_{i=n}^{\infty} \frac{1}{i^{1/\lambda}}.$$
By the convergence of the series $\sum_{i=1}^{\infty} \frac{1}{i^{1/\lambda}}$, we get $d(x_n, x_m) \to 0$ as $n, m \to \infty$. Therefore $\{x_n\}$ is a Cauchy sequence in $X$. Since $X$ is complete, there exists an element $x^* \in X$ such that $x_n \to x^*$ as $n \to \infty$.
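The effect of inequality (2), $F(d(x_n, x_{n+1})) \le F(d(x_0, x_1)) - n\tau$, can be made concrete for a particular $F$. The sketch below assumes $F(a) = \ln a + a$ and $\tau = 1$ purely for illustration and computes, by bisection, the largest value of $d$ compatible with (2) for each $n$; the function name `bound` and the starting value $d(x_0, x_1) = 1$ are hypothetical.

```python
import math

def F(a):
    # Illustrative choice of F; it is strictly increasing on (0, infinity).
    return math.log(a) + a

def bound(d0, tau, n, iters=200):
    """Largest d in (0, d0] with F(d) <= F(d0) - n * tau, located by bisection."""
    target = F(d0) - n * tau
    lo, hi = 0.0, d0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if mid > 0 and F(mid) <= target:
            lo = mid
        else:
            hi = mid
    return lo

bounds = [bound(1.0, 1.0, n) for n in range(6)]
for n, b in enumerate(bounds):
    print(n, b)
```

Since $\ln d \le \ln d + d$, every value returned for step $n$ is at most $e^{F(d_0) - n\tau}$, so the admissible distances decay at least geometrically in this instance.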


Now, if $T_i$ is upper semicontinuous for any $i \in \{1, 2, \ldots, m\}$, then as $x_{2n} \in X$, $x_{2n+1} \in T_i(x_{2n})$ with $x_{2n} \to x^*$ and $x_{2n+1} \to x^*$ as $n \to \infty$, it follows that $x^* \in T_i(x^*)$. Thus from (i), we get $x^* \in T_1(x^*) = T_2(x^*) = \cdots = T_m(x^*)$.


Finally, to prove (iii), suppose the set $\bigcap_{i=1}^{m} \mathrm{Fix}(T_i)$ is well-ordered. We are to show that $\bigcap_{i=1}^{m} \mathrm{Fix}(T_i)$ is a singleton. Assume on the contrary that there exist $u$ and $v$ such that $u, v \in \bigcap_{i=1}^{m} \mathrm{Fix}(T_i)$ but $u \ne v$. As $(u, v) \in \Delta_2$, so $(u_x, v_y) \in \Delta_2$ implies
$$\begin{aligned}
\tau + F(d(u, v)) &\le F(M(u, v; u, v)) \\
&= F\Big(\max\Big\{ d(u, v),\ d(u, u),\ d(v, v),\ \frac{d(u, v) + d(v, u)}{2} \Big\}\Big) \\
&= F(d(u, v)),
\end{aligned}$$
a contradiction as $\tau > 0$. Hence $u = v$. Conversely, if $\bigcap_{i=1}^{m} \mathrm{Fix}(T_i)$ is a singleton, then it follows that $\bigcap_{i=1}^{m} \mathrm{Fix}(T_i)$ is well-ordered.


The following corollary extends and generalizes Theorem 4.1 of [13] and Theorem 3.4 of [21] for two maps in ordered metric spaces.

Corollary 20.1 Let $(X, d, \preceq)$ be an ordered complete metric space and $T_1, T_2 : X \to P_{cl}(X)$ be two multivalued mappings. Suppose that for every $(x, y) \in \Delta_1$ and $u_x \in T_i(x)$, there exists $u_y \in T_j(y)$ for $i, j \in \{1, 2\}$ with $i \ne j$ such that $(u_x, u_y) \in \Delta_2$ implies
$$\tau + F(d(u_x, u_y)) \le F(M(x, y; u_x, u_y)), \qquad (3)$$
where $\tau$ is a positive real number and
$$M(x, y; u_x, u_y) = \max\Big\{ d(x, y),\ d(x, u_x),\ d(y, u_y),\ \frac{d(x, u_y) + d(y, u_x)}{2} \Big\}.$$

Then the following statements hold:

(1) $\mathrm{Fix}(T_i) \ne \emptyset$ for any $i \in \{1, 2\}$ if and only if $\mathrm{Fix}(T_1) = \mathrm{Fix}(T_2) \ne \emptyset$.

(2) $\mathrm{Fix}(T_1) = \mathrm{Fix}(T_2) \ne \emptyset$ provided that $T_1$ or $T_2$ is upper semicontinuous.

(3) $\mathrm{Fix}(T_1) \cap \mathrm{Fix}(T_2)$ is well-ordered if and only if $\mathrm{Fix}(T_1) \cap \mathrm{Fix}(T_2)$ is a singleton set.


Example 20.1 Let $X = \{x_n = \frac{n(n+1)}{2} : n \in \{1, 2, 3, \ldots\}\}$ be endowed with the usual order $\le$. Let
$$\Delta_1 = \{(x, y) : x \le y \text{ where } x, y \in X\} \quad \text{and} \quad \Delta_2 = \{(x, y) : x < y \text{ where } x, y \in X\}.$$

Define $T_1, T_2 : X \to P_{cl}(X)$ as follows:
$$T_1(x) = \{x_1\} \text{ for } x \in X, \qquad
T_2(x) = \begin{cases} \{x_1\}, & x = x_1, \\ \{x_1, x_{n-1}\}, & x = x_n \text{ for } n > 1. \end{cases}$$
Take $F(\alpha) = \ln \alpha + \alpha$, $\alpha > 0$ and $\tau = 1$. For the Euclidean metric $d$ on $X$, and $(u_x, u_y) \in \Delta_2$, we consider the following cases:


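(i) If $x = x_1$, $y = x_m$ for $m > 1$, then for $u_x = x_1 \in T_1(x)$, there exists $u_y = x_{m-1} \in T_2(y)$, such that
$$\begin{aligned}
d(u_x, u_y) e^{d(u_x, u_y) - M(x, y; u_x, u_y)} &\le d(u_x, u_y) e^{d(u_x, u_y) - d(x, y)} \\
&= \frac{m^2 - m - 2}{2} e^{-m} \\
&< \frac{m^2 + m - 2}{2} e^{-1} \\
&= e^{-1} d(x, y) \\
&\le e^{-1} M(x, y; u_x, u_y).
\end{aligned}$$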


(ii) Ifx = xX,, y = Xn41 withn > 1, then for u, = x; € Ti (x), there exists uy = 
Xn-1 € Tr (y), such that 


] 


d(u u Jed Ursty)—M (x, yitts ty) d(etty dy uz) 
x, Uy 2 


IA 


d(ux, Uy etal 


n —n—2 ~3n-2 
bo) 


= ———e 2? 
2 
n> +4n 4 
< a 
_ «(4 (x, uy) +d (y, aay 
2 


<e!'M (36, Py toys 4) 


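(iii) When $x = x_n$, $y = x_m$ with $m > n > 1$, then for $u_x = x_1 \in T_1(x)$, there exists $u_y = x_{n-1} \in T_2(y)$, such that
$$\begin{aligned}
d(u_x, u_y) e^{d(u_x, u_y) - M(x, y; u_x, u_y)} &\le d(u_x, u_y) e^{d(u_x, u_y) - d(x, u_x)} \\
&= \frac{n^2 - n - 2}{2} e^{-n} \\
&< \frac{n^2 + n - 2}{2} e^{-1} \\
&= e^{-1} d(x, u_x) \\
&\le e^{-1} M(x, y; u_x, u_y).
\end{aligned}$$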


Now we show that for $x, y \in X$, $u_x \in T_2(x)$, there exists $u_y \in T_1(y)$ such that $(u_x, u_y) \in \Delta_2$, and (3) of Corollary 20.1 is satisfied. For this, we consider the following cases:
(i) If $x = x_n$, $y = x_1$ with $n > 1$, we have for $u_x = x_{n-1} \in T_2(x)$, there exists $u_y = x_1 \in T_1(y)$, such that
$$\begin{aligned}
d(u_x, u_y) e^{d(u_x, u_y) - M(x, y; u_x, u_y)} &\le d(u_x, u_y) e^{d(u_x, u_y) - d(x, y)} \\
&= \frac{n^2 - n - 2}{2} e^{-n} \\
&< \frac{n^2 + n - 2}{2} e^{-1} \\
&= e^{-1} d(x, y) \\
&\le e^{-1} M(x, y; u_x, u_y).
\end{aligned}$$


(ii) In case $x = x_n$, $y = x_m$ with $m > n > 1$, then for $u_x = x_{n-1} \in T_2(x)$, there exists $u_y = x_1 \in T_1(y)$, such that
$$\begin{aligned}
d(u_x, u_y) e^{d(u_x, u_y) - M(x, y; u_x, u_y)} &\le d(u_x, u_y) e^{d(u_x, u_y) - d(y, u_y)} \\
&= \frac{n^2 - n - 2}{2} e^{\frac{n^2 - n - m^2 - m}{2}} \\
&< \frac{m^2 + m - 2}{2} e^{-1} \\
&= e^{-1} d(y, u_y) \\
&\le e^{-1} M(x, y; u_x, u_y).
\end{aligned}$$

Hence all the conditions of Corollary 20.1 are satisfied. Moreover, $x_1 = 1$ is the unique common fixed point of $T_1$ and $T_2$ with $\mathrm{Fix}(T_1) = \mathrm{Fix}(T_2)$.
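The first case of Example 20.1 can be spot-checked numerically. The sketch below assumes $x_n = n(n+1)/2$, $F(\alpha) = \ln \alpha + \alpha$ and $\tau = 1$ as read from the example, and tests inequality (3) for $x = x_1$, $y = x_m$, $u_x = x_1$, $u_y = x_{m-1}$ over a range of $m$; the helper names are hypothetical, and $m$ starts at 3 so that $d(u_x, u_y) > 0$.

```python
import math

def x(n):
    """n-th element of X, assuming x_n = n(n+1)/2."""
    return n * (n + 1) // 2

def F(a):
    return math.log(a) + a

def M(p, q, up, uq):
    """M(x, y; u_x, u_y) from Corollary 20.1 with the Euclidean metric."""
    d = lambda s, t: abs(s - t)
    return max(d(p, q), d(p, up), d(q, uq), (d(p, uq) + d(q, up)) / 2)

def case_i_holds(m, tau=1.0):
    """Check tau + F(d(u_x, u_y)) <= F(M(x, y; u_x, u_y)) for x = x_1, y = x_m."""
    p, q = x(1), x(m)
    up, uq = x(1), x(m - 1)
    return tau + F(abs(up - uq)) <= F(M(p, q, up, uq))

results = [case_i_holds(m) for m in range(3, 50)]
print(all(results))
```

Taking exponentials, the tested inequality is equivalent to $d(u_x, u_y) e^{d(u_x, u_y) - M} \le e^{-1} M$, which is the form in which the example states it.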

The following result generalizes Theorem 3.4 of [21] and Theorem 3.4 of [22].

Theorem 20.3 Let $(X, d, \preceq)$ be an ordered complete metric space and $\{T_i\}_{i=1}^{m} : X \to P_{cl}(X)$ be a family of multivalued mappings. Suppose that for every $(x, y) \in \Delta_1$ and $u_x \in T_i(x)$, there exists $u_y \in T_{i+1}(y)$ for $i \in \{1, 2, \ldots, m\}$ (with $T_{m+1} = T_1$ by convention) such that $(u_x, u_y) \in \Delta_2$ implies
$$\tau + F(d(u_x, u_y)) \le F(M_2(x, y; u_x, u_y)), \qquad (4)$$
where $\tau$ is a positive real number and
$$M_2(x, y; u_x, u_y) = \alpha d(x, y) + \beta d(x, u_x) + \gamma d(y, u_y) + \delta_1 d(x, u_y) + \delta_2 d(y, u_x),$$
and $\alpha, \beta, \gamma, \delta_1, \delta_2 \ge 0$, $\delta_1 \le \delta_2$ with $\alpha + \beta + \gamma + \delta_1 + \delta_2 \le 1$. Then the following statements hold:

(I) $\mathrm{Fix}(T_i) \ne \emptyset$ for any $i \in \{1, 2, \ldots, m\}$ if and only if $\mathrm{Fix}(T_1) = \mathrm{Fix}(T_2) = \cdots = \mathrm{Fix}(T_m) \ne \emptyset$.
(II) $\mathrm{Fix}(T_1) = \mathrm{Fix}(T_2) = \cdots = \mathrm{Fix}(T_m) \ne \emptyset$ provided that any one $T_i$ for $i \in \{1, 2, \ldots, m\}$ is upper semicontinuous.
(III) $\bigcap_{i=1}^{m} \mathrm{Fix}(T_i)$ is well-ordered if and only if $\bigcap_{i=1}^{m} \mathrm{Fix}(T_i)$ is a singleton set.


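Proof To prove (I), let $x^* \in T_k(x^*)$ for any $k \in \{1, 2, \ldots, m\}$. Assume that $x^* \notin T_{k+1}(x^*)$; then there exists an $x \in T_{k+1}(x^*)$ with $(x^*, x) \in \Delta_2$ such that
$$\tau + F(d(x^*, x)) \le F(M_2(x^*, x^*; x^*, x)),$$
where
$$\begin{aligned}
M_2(x^*, x^*; x^*, x) &= \alpha d(x^*, x^*) + \beta d(x^*, x^*) + \gamma d(x, x^*) + \delta_1 d(x^*, x) + \delta_2 d(x^*, x^*) \\
&= (\gamma + \delta_1) d(x, x^*),
\end{aligned}$$
which implies that
$$\tau + F(d(x^*, x)) \le F((\gamma + \delta_1) d(x^*, x)) \le F(d(x^*, x)),$$
a contradiction as $\tau > 0$. Thus $x^* = x$. Thus $x^* \in T_{k+1}(x^*)$ and so $\mathrm{Fix}(T_k) \subseteq \mathrm{Fix}(T_{k+1})$. Similarly, we obtain that $\mathrm{Fix}(T_{k+1}) \subseteq \mathrm{Fix}(T_{k+2})$ and, continuing this way, we get $\mathrm{Fix}(T_1) = \mathrm{Fix}(T_2) = \cdots = \mathrm{Fix}(T_m)$. The converse is straightforward.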

To prove (II), suppose that $x_0$ is an arbitrary point of $X$. If $x_0 \in T_{k_0}(x_0)$ for some $k_0 \in \{1, 2, \dots, m\}$, then by using (I), the proof is finished. So we assume that $x_0 \notin T_{k_0}(x_0)$ for any $k_0 \in \{1, 2, \dots, m\}$. Now for $i \in \{1, 2, \dots, m\}$, if $x_1 \in T_i(x_0)$, then there exists $x_2 \in T_{i+1}(x_1)$ with $(x_1, x_2) \in \Delta_2$ such that

\[
\tau + F(d(x_1, x_2)) \le F(M_2(x_0, x_1; x_1, x_2)),
\]

where

\[
\begin{aligned}
M_2(x_0, x_1; x_1, x_2) &= \alpha d(x_0, x_1) + \beta d(x_0, x_1) + \gamma d(x_1, x_2) + \delta_1 d(x_0, x_2) + \delta_2 d(x_1, x_1)\\
&\le (\alpha + \beta + \delta_1)\, d(x_0, x_1) + (\gamma + \delta_1)\, d(x_1, x_2).
\end{aligned}
\]

Now, if $d(x_0, x_1) \le d(x_1, x_2)$, then we have

\[
\tau + F(d(x_1, x_2)) \le F((\alpha + \beta + \gamma + 2\delta_1)\, d(x_1, x_2)) \le F(d(x_1, x_2)),
\]

a contradiction. Therefore

\[
\tau + F(d(x_1, x_2)) \le F(d(x_0, x_1)).
\]

Next, for this $x_2 \in T_{i+1}(x_1)$, there exists $x_3 \in T_{i+2}(x_2)$ with $(x_2, x_3) \in \Delta_2$ such that

\[
\tau + F(d(x_2, x_3)) \le F(M_2(x_1, x_2; x_2, x_3)),
\]

where

\[
\begin{aligned}
M_2(x_1, x_2; x_2, x_3) &= \alpha d(x_1, x_2) + \beta d(x_1, x_2) + \gamma d(x_2, x_3) + \delta_1 d(x_1, x_3) + \delta_2 d(x_2, x_2)\\
&\le (\alpha + \beta + \delta_1)\, d(x_1, x_2) + (\gamma + \delta_1)\, d(x_2, x_3).
\end{aligned}
\]

Now, if $d(x_1, x_2) \le d(x_2, x_3)$, then

\[
\tau + F(d(x_2, x_3)) \le F((\alpha + \beta + \gamma + 2\delta_1)\, d(x_2, x_3)) \le F(d(x_2, x_3)),
\]

a contradiction as $\tau > 0$. Therefore

\[
\tau + F(d(x_2, x_3)) \le F(d(x_1, x_2)).
\]
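The contradiction in the two steps above rests on the coefficient constraints of Theorem 20.3. The following worked inequalities are a sketch of that missing step, assuming the constraints read $\delta_1 \le \delta_2$ and $\alpha+\beta+\gamma+\delta_1+\delta_2 \le 1$, and assuming $F$ is strictly increasing (condition (F1)); the triangle inequality accounts for the $\delta_1$ term.

```latex
\begin{align*}
\alpha+\beta+\gamma+2\delta_1 &\le \alpha+\beta+\gamma+\delta_1+\delta_2 \le 1,\\
\delta_1\, d(x_0,x_2) &\le \delta_1\, d(x_0,x_1)+\delta_1\, d(x_1,x_2),\\
M_2(x_0,x_1;x_1,x_2) &\le (\alpha+\beta+\delta_1)\,d(x_0,x_1)+(\gamma+\delta_1)\,d(x_1,x_2)\\
&\le (\alpha+\beta+\gamma+2\delta_1)\,\max\{d(x_0,x_1),\,d(x_1,x_2)\}\\
&\le \max\{d(x_0,x_1),\,d(x_1,x_2)\}.
\end{align*}
```

If the maximum were $d(x_1,x_2)$, monotonicity of $F$ would give $\tau + F(d(x_1,x_2)) \le F(d(x_1,x_2))$, which is impossible for $\tau > 0$; hence the maximum is $d(x_0,x_1)$.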


Continuing this process, for $x_{2n} \in T_i(x_{2n-1})$, there exists $x_{2n+1} \in T_{i+1}(x_{2n})$ with $(x_{2n}, x_{2n+1}) \in \Delta_2$ such that

\[
\tau + F(d(x_{2n}, x_{2n+1})) \le F(M_2(x_{2n-1}, x_{2n}; x_{2n}, x_{2n+1})),
\]

where

\[
\begin{aligned}
M_2(x_{2n-1}, x_{2n}; x_{2n}, x_{2n+1}) &= \alpha d(x_{2n-1}, x_{2n}) + \beta d(x_{2n-1}, x_{2n}) + \gamma d(x_{2n}, x_{2n+1})\\
&\quad + \delta_1 d(x_{2n-1}, x_{2n+1}) + \delta_2 d(x_{2n}, x_{2n})\\
&\le (\alpha + \beta + \delta_1)\, d(x_{2n-1}, x_{2n}) + (\gamma + \delta_1)\, d(x_{2n}, x_{2n+1})\\
&\le d(x_{2n-1}, x_{2n}),
\end{aligned}
\]

that is,

\[
\tau + F(d(x_{2n}, x_{2n+1})) \le F(d(x_{2n-1}, x_{2n})).
\]

Similarly, for $x_{2n+1} \in T_{i+1}(x_{2n})$, there exists $x_{2n+2} \in T_{i+2}(x_{2n+1})$ such that $(x_{2n+1}, x_{2n+2}) \in \Delta_2$ implies

\[
\tau + F(d(x_{2n+1}, x_{2n+2})) \le F(d(x_{2n}, x_{2n+1})).
\]

Hence, we obtain a sequence $\{x_n\}$ in $X$ such that for $x_n \in T_i(x_{n-1})$, there exists $x_{n+1} \in T_{i+1}(x_n)$ with $(x_n, x_{n+1}) \in \Delta_2$ such that

\[
\tau + F(d(x_n, x_{n+1})) \le F(d(x_{n-1}, x_n)).
\]

Therefore

\[
F(d(x_n, x_{n+1})) \le F(d(x_{n-1}, x_n)) - \tau \le F(d(x_{n-2}, x_{n-1})) - 2\tau \le \dots \le F(d(x_0, x_1)) - n\tau. \tag{5}
\]

From (5), we obtain $\lim_{n \to \infty} F(d(x_n, x_{n+1})) = -\infty$, which together with (F2) gives

\[
\lim_{n \to \infty} d(x_n, x_{n+1}) = 0.
\]

Following the arguments in the proof of Theorem 20.2, $\{x_n\}$ is a Cauchy sequence in $X$. Since $X$ is complete, there exists an element $x^* \in X$ such that $x_n \to x^*$ as $n \to \infty$.
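The telescoped estimate (5) can be checked numerically in a simple special case. The sketch below assumes the particular choice $F(t) = \ln t$, for which $\tau + F(d_n) \le F(d_{n-1})$ reduces to $d_n \le e^{-\tau} d_{n-1}$; the function names `telescoped_bound` and `worst_case_distances` are hypothetical helpers introduced only for illustration, and $d_n$ stands for $d(x_n, x_{n+1})$.

```python
import math


def telescoped_bound(d0: float, tau: float, n: int) -> float:
    """Bound on d(x_n, x_{n+1}) implied by (5) when F(t) = ln t (an assumed choice of F)."""
    # F(d_n) <= F(d_0) - n*tau  <=>  d_n <= exp(-n*tau) * d_0
    return math.exp(-n * tau) * d0


def worst_case_distances(d0: float, tau: float, steps: int) -> list:
    """Distances meeting tau + ln(d_n) <= ln(d_{n-1}) with equality at every step."""
    dists = [d0]
    for _ in range(steps):
        dists.append(math.exp(-tau) * dists[-1])
    return dists


if __name__ == "__main__":
    d0, tau = 3.0, 0.5
    dists = worst_case_distances(d0, tau, 20)
    for n, d in enumerate(dists):
        # Inequality (5) for F = ln, with a small tolerance for rounding.
        assert math.log(d) <= math.log(d0) - n * tau + 1e-12
        assert d <= telescoped_bound(d0, tau, n) + 1e-12
    print(dists[-1])  # the distances shrink geometrically towards 0
```

Even in the extremal case where every step holds with equality, the consecutive distances decay geometrically, which is the behaviour that (F2) turns into $d(x_n, x_{n+1}) \to 0$ for general $F$.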


Now, if $T_i$ is upper semicontinuous for some $i \in \{1, 2, \dots, m\}$, then, since $x_{2n+1} \in T_i(x_{2n})$ with $x_{2n} \to x^*$ and $x_{2n+1} \to x^*$ as $n \to \infty$, it follows that $x^* \in T_i(x^*)$. Thus from (I), we get $x^* \in \mathrm{Fix}(T_1) = \mathrm{Fix}(T_2) = \dots = \mathrm{Fix}(T_m)$.


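Finally, to prove (III), suppose the set $\bigcap_{i=1}^{m} \mathrm{Fix}(T_i)$ is well-ordered. We are to show that $\bigcap_{i=1}^{m} \mathrm{Fix}(T_i)$ is a singleton. Assume on the contrary that there exist $u$ and $v$ such that $u, v \in \bigcap_{i=1}^{m} \mathrm{Fix}(T_i)$ but $u \ne v$. As $(u, v) \in \Delta_2$, condition (4) with $u_x = u$ and $u_y = v$ implies

\[
\tau + F(d(u, v)) \le F(M_2(u, v; u, v)),
\]

where

\[
\begin{aligned}
M_2(u, v; u, v) &= \alpha d(u, v) + \beta d(u, u) + \gamma d(v, v) + \delta_1 d(u, v) + \delta_2 d(v, u)\\
&= (\alpha + \delta_1 + \delta_2)\, d(u, v),
\end{aligned}
\]

that is,

\[
\tau + F(d(u, v)) \le F((\alpha + \delta_1 + \delta_2)\, d(u, v)) \le F(d(u, v)),
\]

a contradiction as $\tau > 0$. Hence $u = v$. Conversely, if $\bigcap_{i=1}^{m} \mathrm{Fix}(T_i)$ is a singleton, then it follows that $\bigcap_{i=1}^{m} \mathrm{Fix}(T_i)$ is well-ordered.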


The following corollary extends Theorem 3.1 of [21] to the case of a family of mappings in an ordered metric space.


Corollary 20.2 Let $(X, d, \preceq)$ be an ordered complete metric space and $\{T_i\}_{i=1}^{m} : X \to P_{cl}(X)$ be a family of multivalued mappings. Suppose that for every $(x, y) \in \Delta$ and $u_x \in T_i(x)$, there exists $u_y \in T_{i+1}(y)$ for $i \in \{1, 2, \dots, m\}$ (with $T_{m+1} = T_1$ by convention) such that $(u_x, u_y) \in \Delta_2$ implies

\[
\tau + F(d(u_x, u_y)) \le F(\alpha d(x, y) + \beta d(x, u_x) + \gamma d(y, u_y)), \tag{6}
\]

where $\tau$ is a positive real number and $\alpha, \beta, \gamma \ge 0$ with $\alpha + \beta + \gamma \le 1$. Then the conclusions obtained in Theorem 20.3 remain true.


The following corollary extends Theorem 4.1 of [13]. 


Corollary 20.3 Let $(X, d, \preceq)$ be an ordered complete metric space and $\{T_i\}_{i=1}^{m} : X \to P_{cl}(X)$ be a family of multivalued mappings. Suppose that for every $(x, y) \in \Delta$ and $u_x \in T_i(x)$, there exists $u_y \in T_{i+1}(y)$ for $i \in \{1, 2, \dots, m\}$ (with $T_{m+1} = T_1$ by convention) such that $(u_x, u_y) \in \Delta_2$ implies

\[
\tau + F(d(u_x, u_y)) \le F(h[d(x, u_x) + d(y, u_y)]), \tag{7}
\]

where $\tau$ is a positive real number and $h \in [0, \tfrac{1}{2}]$. Then the conclusions obtained in Theorem 20.3 remain true.


Corollary 20.4 Let $(X, d, \preceq)$ be an ordered complete metric space and $\{T_i\}_{i=1}^{m} : X \to P_{cl}(X)$ be a family of multivalued mappings. Suppose that for every $(x, y) \in \Delta$ and $u_x \in T_i(x)$, there exists $u_y \in T_{i+1}(y)$ for $i \in \{1, 2, \dots, m\}$ (with $T_{m+1} = T_1$ by convention) such that $(u_x, u_y) \in \Delta_2$ implies

\[
\tau + F(d(u_x, u_y)) \le F(d(x, y)), \tag{8}
\]

where $\tau$ is a positive real number. Then the conclusions obtained in Theorem 20.3 remain true.


The above corollary extends Theorem 4.1 of [13]. 
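One way to see why Corollaries 20.2 to 20.4 inherit the conclusions of Theorem 20.3 is to read each contractive condition as (4) with a particular choice of coefficients in $M_2$. The summary below is a sketch read off from (6) to (8); it assumes that the bound on $h$ is $h \le \tfrac{1}{2}$, so that $\beta + \gamma = 2h \le 1$.

```latex
\[
\begin{array}{lll}
\text{Corollary 20.2:} & \delta_1=\delta_2=0, &
  M_2 = \alpha\, d(x,y)+\beta\, d(x,u_x)+\gamma\, d(y,u_y),\\[2pt]
\text{Corollary 20.3:} & \alpha=\delta_1=\delta_2=0,\ \beta=\gamma=h, &
  M_2 = h\,[\,d(x,u_x)+d(y,u_y)\,],\\[2pt]
\text{Corollary 20.4:} & \alpha=1,\ \beta=\gamma=\delta_1=\delta_2=0, &
  M_2 = d(x,y).
\end{array}
\]
```

In each row the coefficients are nonnegative, satisfy $\delta_1 \le \delta_2$, and sum to at most 1, so the hypotheses of Theorem 20.3 hold.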


3 Conclusion 


Many recent results in the literature address common fixed point problems for
multivalued maps. In this paper we obtained results on the existence of common
fixed points of a family of maps satisfying generalized F-contractions in ordered
structured metric spaces. We presented some examples to show the validity of the
established results.


Acknowledgements Talat Nazir and Xiaomin Qi are grateful to the Erasmus Mundus project
FUSION for supporting the research visit to Mälardalen University, Sweden, and to the Research
environment MAM in Mathematics and Applied Mathematics, Division of Applied Mathematics,
the School of Education, Culture and Communication of Mälardalen University for creating
an excellent research environment.


432 T. Nazir and S. Silvestrov 


References 


1. Abbas, M., Rhoades, B.E.: Fixed point theorems for two new classes of multivalued mappings. 
Appl. Math. Lett. 22, 1364-1368 (2009) 
2. Abbas, M., Khan, A.R., Nazir, T.: Common fixed point of multivalued mappings in ordered 
generalized metric spaces. Filomat. 26(5), 1045-1053 (2012) 
3. Abbas, M., Ali, B., Romaguera, S.: Fixed and periodic points of generalized contractions in 
metric spaces. Fixed Point Theory Appl. 243, 1-11 (2013) 
4. Abbas, M., Ali, B., Romaguera, S.: Generalized contraction and invariant approximation results 
on nonconvex subsets of normed spaces. Abstr. Appl. Anal. 2014 Article ID 391952, 1-5 (2014) 
5. Acar, O., Durmaz, G., Minak, G.: Generalized multivalued F-contraction on complete metric
spaces. Bull. Iranian Math. Soc. 40(6), 1469-1478 (2014)
6. Al-Thagafi, M.A., Shahzad, N.: Coincidence points, generalized I-nonexpansive multimaps
and applications. Nonlinear Anal. 67, 2180-2188 (2007)
7. Berinde, M., Berinde, V.: On general class of multivalued weakly Picard mappings. J. Math. 
Anal. Appl. 326, 772-782 (2007) 
8. Ćirić, L.: Generalization of Banach’s contraction principle. Proc. Amer. Math. Soc. 45(2),
267-273 (1974)
9. Ćirić, L.: Multi-valued nonlinear contraction mappings. Nonlinear Anal. Theory Method Appl.
71(7-8), 2716-2723 (2009)
10. Jungck, G., Rhoades, B.E.: Fixed points for set valued functions without continuity. Indian J. 
Pure Appl. Math. 29, 227-238 (1998) 
11. Kamran, T.: Multivalued f-weakly Picard mappings. Nonlinear Anal. 67, 2289-2296 (2007)
12. Kannan, R.: Some results on fixed points. Bull. Calcutta. Math. Soc. 60, 71-76 (1968) 
13. Latif, A., Beg, I.: Geometric fixed points for single and multivalued mappings. Demonstratio 
Math. 30(4), 791-800 (1997) 
14. Latif, A., Tweddle, I.: On multivalued nonexpansive maps. Demonstratio. Math. 32, 565-574 
(1999) 
15. Lazar, T., O’Regan, D., Petrusel, A.: Fixed points and homotopy results for Ćirić-type
multivalued operators on a set with two metrics. Bull. Korean Math. Soc. 45(1), 67-73 (2008)
16. Markin, J.T.: Continuous dependence of fixed point sets. Proc. Amer. Math. Soc. 38, 545-547 
(1973) 
17. Minak, G., Helvasi, A., Altun, I.: Ćirić type generalized F-contractions on complete metric
spaces and fixed point results. Filomat. 28(6), 1143-1151 (2014)
18. Mot, G., Petrusel, A.: Fixed point theory for a new type of contractive multivalued operators. 
Nonlinear Anal. Theory Method Appl. 70(9), 3371-3377 (2009) 
19. Nadler, S.B.: Multivalued contraction mappings. Pacific J. Math. 30, 475-488 (1969) 
20. Rhoades, B.E.: On multivalued f-nonexpansive maps. Fixed Point Theory Appl. 2, 89-92
(2001)
21. Rus, I.A., Petrusel, A., Sintamarian, A.: Data dependence of fixed point set of some multivalued
weakly Picard operators. Nonlinear Anal. 52, 1944-1959 (2003)
22. Sgroi, M., Vetro, C.: Multi-valued F-contractions and the solution of certain functional and
integral equations. Filomat. 27(7), 1259-1268 (2013)
23. Sintunavarat, W., Kumam, P.: Common fixed point theorem for cyclic generalized multi-valued 
contraction mappings. Appl. Math. Lett. 25(11), 1849-1855 (2012) 
24. Turkoglu, D., Binbasioglu, D.: Some fixed-point theorems for multivalued monotone mappings 
in ordered uniform space. Fixed Point Theory Appl. 2011 Article ID 186237, 1-12 (2011) 
25. Wardowski, D.: Fixed points of new type of contractive mappings in complete metric spaces. 
Fixed Point Theory Appl. 94, 1-6 (2012) 


Index 


A 

Algebra of piece-wise constant functions, 
78, 98 

Algebraic dependence, 58, 67, 70 

α-strongly monotone map, 368

ambiguity, 50 

Annihilating polynomial, 58 

Applicability of centralities, 288 

Asymptotic expansion, 124, 129, 132, 136, 
147 


B 

Banach contraction principle, 371 

Banach G-contraction, 398 

Betweenness centrality, 291 

Binary classification, 356 

Biological networks, 279 

Blockwise inverse, 233 

Blockwise inversion, 243, 257 

Bounded dimension homogeneous central- 
izers (BDHC), 66 


C

Cancer genes, 277 

Cauchy sequence, 39 

Center, 104 

Centrality ranking, 284 

Centralizer, 66, 67 

Clifford algebra, 9 
matrix representation, 11

closed set, 38 

Closeness centrality, 290 

Coincidence point, 400 

Commutant, 97 


© Springer International Publishing Switzerland 2016 


Commutative subalgebra, 95 
maximal, 82, 96, 97, 99 
Commuting elements, 58, 70 
Complex Shannon wavelet, 338, 343 
Complex-Variable distribution, 344 
Condition 
perturbation, 174, 175, 177 
Control function, 357 
convergent sequence, 38 
Convolution, 339 
Crossed product algebra, 77, 97 
Cspan, 42 


D 
DCC in norm, 48 
Decision boundaries, 356 
Decision function, 379 
Decision plane, 356 
Decision problems 
P, NP, Co-NP, 316
Degree, 66 
Degree centrality, 289 
Dense subset, 39 
Descending chain 
condition 
in norm, 48 
Differential calculus 
coordinate first order, 18 
first order, 18 
Differential operator ring, 58, 66 
Dirac bracket notation, 339 
DIS, 50 
Distribution 
initial, 173, 176 
stationary, 178 


S. Silvestrov and M. Rančić (eds.), Engineering Mathematics I,
Springer Proceedings in Mathematics & Statistics 179, 


DOI 10.1007/978-3-319-42105-6 






Down-set 

ideal section, 50 
down-set module, 47 
Dynamic programming, 320 


E 
Element (in R(X)) 
irreducible, 46 
persistently reducible, 46 
stuck in, 46 
uniquely reducible, 46 
Eliminant, 60 
Eliminant construction for Ore extension, 61 
Enrichment of cancer genes, 296 
B_ε(b) (ε-neighbourhood for b ∈ R(X)), 46
ε-chainable metric space, 402
Equicontinuous family of maps, 39 
Euler Beta function, 348 
Expansion 
Laurent asymptotic, 155 
pivotal, 155 
Taylor asymptotic, 155 


F 
F-contraction mapping, 420 
multivalued generalized, 420 
Feature space, 378 
Filippov identity, 3 
First hitting time, 110, 112, 132-134 
First order differential calculus, 18 
Fixed point, 367, 394, 400 
of a multivalued mapping, 421 
Fractional derivative (distribution sense) 
on C, 348, 349 
on R, 348 
Fractional derivative of complex wavelets 
Gabor—Morlet, 350-352 
Functional annotations, 301 
Functional margin, 356, 380 


G 
Gabor-Morlet wavelet, 338, 350, 352 
Galois extension 
noncommutative, 15 
semi-commutative, 15 
Generalized Clifford algebra 
matrix representation, 26 
Generalized support vector machine, 356, 
358, 387 
Generalized variational inequality, 358, 393 
Geometric margin, 357, 380
Graded N-differential, 17
Graded q-commutator, 16
Graded q-derivation, 16 
Graded q-differential algebra, 17 
Graph 
complete, 227, 236, 254 
connected, 398 
directed, 398 
simple line, 227, 230, 254 
weakly connected, 398 
weighted, 399 
Graph φ-contraction, 400


H 

Hamel basis, 42 

Heaviside step function, 338 
Heisenberg algebra, 58 
Hilbert basis, 42 


I 

Ideal Z(S) (of rewrite system), 43 

Induced n-Lie superalgebra, 4 

Irr (set of irreducible elements in R(X)), 46 


J 
Just-in-time production systems (JITPS), 
313 


K 

Kernel 
inverse multi-quadratic kernel, 380 
kernel representation, 378 
linear kernel, 379 
polynomial kernel, 379 
radial basis function kernel, 379 
sigmoid kernel, 379 


L 
Length 
of asymptotic expansion, 171 
Lie superalgebra, 2 
homogeneous elements, 3 
parity, 3 
limit, 38 
limit point, 38 
Linear classifier, 356 
Linear combinations, 293 
Linear learning machine, 378 
L²-inner product, 338, 339
L-Lipschitz map, 367 




M 
Marcinkiewicz interpolation theorem, 332 
Markov chain, 110, 111, 133, 134, 144, 145 
continuous, 176 
discrete, 176 
embedded, 176 
perturbed, 173 
Markov renewal process, 111, 134 
Module norm, 37 
Monotone map, 368 
Multivalued selfmap, 402 


N 
n-ary multiplication law, 2 
n-Lie algebra, 1, 2 
n-Lie superalgebra, 2, 3 
derived series, 4 
descending central series, 4 
ideal, 4 
nilpotent, 4 
solvable, 4 
structure constants, 7 
Nonexpansive map, 394 
Nonlinear integer programming problem, 
316 
Non-linear learning machine, 379 
norm, 37 
normal form, 46 
Normed 
R-algebra, 37 
R-module, 37 
ring, 37 


O 

Objective function, 357, 380 

open subset, 38 

Optimal hyperplane, 356 

Ordered complete metric space, 427 

Ore extension, 58, 66 

Orthogonal submodule, 42 

to a submodule, 42 

Ortigueira-Caputo fractional derivative, 338, 
390, 352 

Output rate variation problem (ORVP), 313 

—, topological closure, 42 


P 

PageRank, 227, 251 

Partially ordered metric space, 421 
Pegging assumption, 320
Perturbation, 109, 123, 132, 133, 135
o-trace, 3 
Pompeiu-Hausdorff metric, 399 
Pompeiu-Hausdorff weight, 399 
Positive definite distributions, 338, 340-342, 
352
Probability 
non-absorption, 184 
transition, 173, 176, 181 
Process 
perturbed Markov renewal, 176 
perturbed semi-Markov, 176 
reduced semi-Markov, 181
Production rate variation problem (PRVP), 
313 
Pseudo-degree, 68 
Pseudo-degree function, 72 
w\-contraction pair, 401 
w2-contraction pair, 401 


Q 
q-deformed Weyl algebra, 58
Quadratic programming problem, 356 
Quasi-stationary distribution, 110, 131-133, 
135-137, 139, 145 
Quaternion algebra, 22 
Quotient 
norm, 41 
q-Weyl algebra, 58


R 
R-algebra norm, 37 
Random walk, 229, 261 
Reconstruction formula, 340, 343 
Red(S), 46 
Reduction, 45 
acting trivially, 45 
simple, 45 
Renewal equation, 110, 133, 136 
Resolvable 
relative to, 51 
Rewrite rule, 43 
compatible, 47 
Rewrite system, 43 
Rewriting system 
confluent, 46 
Riesz—Thorin theorem, 332 
ring norm, 37 
ring with norm, 37 
Rule (for Laurent asymptotic expansions) 
division, 159, 161 




multiple multiplication, 163, 164 
multiple summation, 162, 163 
multiplication, 158, 160 
multiplication by a constant, 158, 159 
operational, 158, 159, 162 
reciprocal, 158, 160 
summation, 158, 159 

Rule (for rewrite system), 43 


S 


Schwartz theorem, 341
Semi-Markov process, 109-111, 132-134,
144
semigroup partial order, 47 
Set, 174 
orthogonal, 42 
topologically linearly independent, 42 
transition, 174 
sinc function, 338 
Solidarity property, 119 
Space 
phase, 173, 176 
reduced phase, 180 
Span, 42 
Strictly monotone map, 368 
Support vector classification, 356 
Support vector machine, 355 


T 
Tempered distributions, 337, 338, 340 




Time 
hitting, 179 
Topological closure, 39 
Topologically complete, 39 
Topological properties, 282 
Trivial norm, 37 
T (S) (set of reductions), 45 
Twisted convolution, 76, 96 
Twisted Leibniz rule for derivatives, 20 


U 
Ultranorm, 40 
Upper semicontinuous mapping, 421 


V
Valuation, 40, 68 
v-degree norm, 38 


W
Wavelet expansion 
of positive definite distributions, 341 
on -%(R), 341 
Wavelet transform of distributions, 339 
Weyl algebra, 58 


x 
&-impulse, 345-347 


